STARCH ENCAPSULATION

CROSS-REFERENCE TO RELATED APPLICATIONS 

This application claims priority to provisional patent application serial No.
60/026,855 filed September 30, 1996. Said provisional application is incorporated herein
by reference to the extent not inconsistent herewith.

BACKGROUND OF THE INVENTION 
Polysaccharide Enzymes 

Both prokaryotic and eukaryotic cells use polysaccharides as a storage
reserve. In the prokaryotic cell the primary reserve polysaccharide is glycogen. Although
glycogen is similar to the starch found in most vascular plants, it exhibits different chain
lengths and degrees of polymerization. In many plants, starch is used as the primary
reserve polysaccharide. Starch is stored in the various tissues of the starch-bearing plant.
Starch is made of two components in most instances: amylose and
amylopectin. Amylose is formed as linear glucans and amylopectin is formed as branched
chains of glucans. Typical starch has a ratio of 25% amylose to 75% amylopectin.
Variations in the amylose to amylopectin ratio in a plant can affect the properties of the
starch. Additionally, starches from different plants often have different properties. Maize
starch and potato starch appear to differ due to the presence or absence of phosphate
groups. Certain plants' starch properties differ because of mutations that have been
introduced into the plant genome. Mutant starches are well known in maize, rice, peas
and the like.

The changes in starch branching or in the ratios of the starch components result in
different starch characteristics. One characteristic of starch is the formation of starch
granules, which are formed particularly in leaves, roots, tubers and seeds. These granules
are formed during the starch synthesis process. Certain synthases of starch, particularly
granule-bound starch synthase, soluble starch synthases and branching enzymes, are
proteins that are "encapsulated" within the starch granule when it is formed.



The use of cDNA clones of animal and bacterial glycogen synthases is described
in International patent application publication number GB92/01881. The nucleotide and
amino acid sequences of glycogen synthase are known from the literature. For example,
the nucleotide sequence for the E. coli glgA gene encoding glycogen synthase can be
retrieved from the GenBank/EMBL (SWISSPROT) database, accession number J02616
(Kumar et al., 1986, J. Biol. Chem., 261:16256-16259). E. coli glycogen biosynthetic
enzyme structural genes were also cloned by Okita et al. (1981, J. Biol. Chem.,
256(13):6944-6952). The glycogen synthase glgA structural gene was cloned from
Salmonella typhimurium LT2 by Leung et al. (1987, J. Bacteriol., 169(9):4349-4354). The
sequences of glycogen synthase from rabbit skeletal muscle (Zhang et al., 1989, FASEB
J., 3:2532-2536) and human muscle (Browner et al., 1989, Proc. Natl. Acad. Sci., 86:1443-
1447) are also known.

The use of cDNA clones of plant soluble starch synthases has been reported. The
amino acid sequences of pea soluble starch synthase isoforms I and II were published by
Dry et al. (1992, Plant Journal, 2:193-202). The amino acid sequence of rice soluble starch
synthase was described by Baba et al. (1993, Plant Physiology, 103:565-573). This last
sequence (rice SSTS) incorrectly cites the N-terminal sequence and hence is misleading.
Presumably this is because of some extraction error involving protease degradation or other
inherent instability in the extracted enzyme. The correct N-terminal sequence (starting with
AELSR) is present in what they refer to as the transit peptide sequence of the rice SSTS.

The sequence of maize branching enzyme I was investigated by Baba et al. (1991,
BBRC, 181:87-94). Starch branching enzyme II from maize endosperm was investigated by
Fisher and Shrable (1993, Plant Physiol., 102:1045-1046). The use of cDNA clones of
plant, bacterial and animal branching enzymes has been reported. The nucleotide and
amino acid sequences for bacterial branching enzymes (BE) are known from the literature.
For example, Kiel et al. cloned the branching enzyme gene glgB from the cyanobacterium
Synechococcus sp. PCC7942 (1989, Gene (Amst), 78(1):9-18) and from Bacillus
stearothermophilus (Kiel et al., 1991, Mol. Gen. Genet., 230(1-2):136-144). The genes
glc3 and gha1 of S. cerevisiae are allelic and encode the glycogen branching enzyme
(Rowen et al., 1992, Mol. Cell Biol., 12(1):22-29). Matsumoto et al. investigated
glycogen branching enzyme from Neurospora crassa (1990, J. Biochem., 107:118-122).
The GenBank/EMBL database also contains sequences for the E. coli glgB gene encoding
branching enzyme.

Starch synthase (EC 2.4.1.11) elongates starch molecules and is thought to act on
both amylose and amylopectin. Starch synthase (STS) activity can be found associated
both with the granule and in the stroma of the plastid. The capacity for starch association
of the bound starch synthase enzyme is well known. Various enzymes involved in starch
biosynthesis are now known to have differing propensities for binding, as described by
Mu-Forster et al. (1996, Plant Physiol. 111: 821-829). Granule-bound starch synthase (GBSTS)
activity is strongly correlated with the product of the waxy gene (Shure et al., 1983, Cell
35: 225-233). The synthesis of amylose in a number of species such as maize, rice and
potato has been shown to depend on the expression of this gene (Tsai, 1974, Biochem
Gen 11: 83-96; Hovenkamp-Hermelink et al., 1987, Theor. Appl. Gen. 75: 217-221).
Visser et al. described the molecular cloning and partial characterization of the gene for
granule-bound starch synthase from potato (1989, Plant Sci. 64(2):185-192). Visser et al.
have also described the inhibition of the expression of the gene for granule-bound starch
synthase in potato by antisense constructs (1991, Mol. Gen. Genet. 225(2):289-296).

The other STS enzymes have become known as soluble starch synthases, following
the pioneering work of Frydman and Cardini (Frydman and Cardini, 1964, Biochem.
Biophys. Res. Communications 17: 407-411). Recently, the appropriateness of the term
"soluble" has become questionable in light of discoveries that these enzymes are
associated with the granule as well as being present in the soluble phase (Denyer et al.,
1993, Plant J. 4: 191-198; Denyer et al., 1995, Planta 97: 57-62; Mu-Forster et al., 1996,
Plant Physiol. 111: 821-829). It is generally believed that the biosynthesis of amylopectin
involves the interaction of soluble starch synthases and starch branching enzymes.
Different isoforms of soluble starch synthase have been identified and cloned in pea
(Denyer and Smith, 1992, Planta 186: 609-617; Dry et al., 1992, Plant Journal, 2: 193-
202), potato (Edwards et al., 1995, Plant Physiol 112: 89-97; Marshall et al., 1996, Plant
Cell 8: 1121-1135) and in rice (Baba et al., 1993, Plant Physiol. 103: 565-573), while
barley appears to contain multiple isoforms, some of which are associated with starch
branching enzyme (Tyynela and Schulman, 1994, Physiol. Plantarum 89: 835-841). A
common characteristic of STS clones is the presence of a KXGGLGDV consensus
sequence which is believed to be the ADP-Glc binding site of the enzyme (Furukawa et
al., 1990, J Biol Chem 265: 2086-2090; Furukawa et al., 1993, J. Biol. Chem. 268: 23837-
23842).


In maize, two soluble forms of STS, known as isoforms I and II, have been
identified (Macdonald and Preiss, 1983, Plant Physiol. 73: 175-178; Boyer and Preiss,
1978, Carb. Res. 61: 321-334; Pollock and Preiss, 1980, Arch Biochem. Biophys. 204:
578-588; Macdonald and Preiss, 1985, Plant Physiol. 78: 849-852; Dang and Boyer, 1988,
Phytochemistry 27: 1255-1259; Mu et al., 1994, Plant J. 6: 151-159), but neither of these
has been cloned. STSI activity of maize endosperm was recently correlated with a 76-kDa
polypeptide found in both soluble and granule-associated fractions (Mu et al., 1994, Plant
J. 6: 151-159). The polypeptide identity of STSII remains unknown. STSI and II exhibit
different enzymological characteristics. STSI exhibits primer-independent activity whereas
STSII requires glycogen primer to catalyze glucosyl transfer. Soluble starch synthases
have been reported to have a high flux control coefficient for starch deposition (Jenner et
al., 1993, Aust. J. Plant Physiol. 22: 703-709; Keeling et al., 1993, Planta 191: 342-348)
and to have unusual kinetic properties at elevated temperatures (Keeling et al., 1995, Aust.
J. Plant Physiol. 21: 807-827). The respective isoforms in maize exhibit significant
differences in both temperature optima and stability.

Plant starch synthase (and E. coli glycogen synthase) sequences include the
sequence KTGGL which is known to be the ADPG binding domain. The genes for any
such starch synthase protein may be used in constructs according to this invention.
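
By way of illustration only, the KXGGLGDV consensus sequence and the KTGGL sequence mentioned above can be located in a candidate protein sequence with a simple pattern search. The following Python sketch is not part of the original disclosure. It assumes one-letter amino acid codes, reads the "X" in KXGGLGDV as "any residue" (an interpretation of the notation), and uses an invented test sequence.

```python
import re
from typing import Dict, List

# Motifs quoted in the text. Reading "X" as "any residue" is an assumption
# about the notation, not something the text spells out.
MOTIFS = {
    "KXGGLGDV": re.compile(r"K.GGLGDV"),
    "KTGGL": re.compile(r"KTGGL"),
}


def find_motifs(protein: str) -> Dict[str, List[int]]:
    """Return 1-based start positions of each motif in a one-letter sequence."""
    seq = protein.upper()
    return {
        name: [match.start() + 1 for match in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


# Hypothetical fragment, invented only to exercise the function.
example_fragment = "MAALVSGKTGGLGDVLGGLP"
motif_hits = find_motifs(example_fragment)
```

A sequence in which neither list is empty would be a candidate for closer inspection. The search says nothing about whether the protein actually binds ADP-glucose or starch.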

Branching enzyme [α-1,4-D-glucan: α-1,4-D-glucan 6-D-(α-1,4-D-glucano) transferase (E.C.
2.4.1.18)], sometimes called Q-enzyme, converts amylose to amylopectin. A segment of an
α-1,4-D-glucan chain is transferred to a primary hydroxyl group in a similar glucan chain.
Bacterial branching enzyme genes and plant sequences have been reported (rice
endosperm: Nakamura et al., 1992, Physiologia Plantarum, 84:329-335 and Nakamura and
Yamanouchi, 1992, Plant Physiol., 99:1265-1266; pea: Smith, 1988, Planta, 175:270-279
and Bhattacharyya et al., 1989, J. Cell Biochem., Suppl. 13D:331; maize endosperm:
Singh and Preiss, 1985, Plant Physiology, 79:34-40; Vos-Scherperkeuter et al., 1989, Plant
Physiology, 90:75-84; potato: Kossmann et al., 1991, Mol. Gen. Genet. 230(1-2):39-44;
cassava: Salehuzzaman and Visser, 1992, Plant Mol Biol, 20:809-819).

In the area of polysaccharide enzymes there are reports of vectors for engineering
modification in the starch pathway of plants by use of a number of starch synthesis genes
in various plant species. That some of these polysaccharide enzymes bind to cellulose or
starch or glycogen is well known. One specific patent example of the use of a
polysaccharide enzyme shows the use of glycogen biosynthesis enzymes to modify plant
starch. In U.S. patent 5,349,123 to Shewmaker a vector containing DNA to form glycogen
biosynthetic enzymes within plant cells is taught. Specifically, this patent refers to the
changes in potato starch due to the introduction of these enzymes. Other starch synthesis
genes and their use have also been reported.

Hybrid (fusion) Peptides 

Hybrid proteins (also called "fusion proteins") are polypeptide chains that consist of
two or more proteins fused together into a single polypeptide. Often one of the proteins is
a ligand which binds to a specific receptor cell. Vectors encoding fusion peptides are
primarily used to produce foreign proteins through fermentation of microbes. The fusion
proteins produced can then be purified by affinity chromatography. The binding portion of
one of the polypeptides is used to attach the hybrid polypeptide to an affinity matrix. For
example, fusion proteins can be formed with beta galactosidase which can be bound to a
column. This method has been used to form viral antigens.

Another use is to recover one of the polypeptides of the hybrid polypeptide.
Chemical and biological methods are known for cleaving the fused peptide. Low pH can
be used to cleave the peptides if an acid-labile aspartyl-proline linkage is employed
between the peptides and the peptides are not affected by the acid. Hormones have been
cleaved with cyanogen bromide. Additionally, cleavage by site-specific proteolysis has been
reported. Other methods of protein purification such as ion chromatography have been
enhanced with the use of polyarginine tails which increase overall basicity of the protein,
thus enhancing binding to ion exchange columns.
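
As a purely illustrative aid, the effect of cleaving at an acid-labile aspartyl-proline linkage can be modeled on a one-letter amino acid string by splitting between each Asp (D) and the Pro (P) that follows it. The Python sketch below is not taken from the cited art. The function name and the test sequence are hypothetical, and the model ignores reaction conditions, incomplete cleavage and any effect of acid on the fragments.

```python
from typing import List


def cleave_asp_pro(sequence: str) -> List[str]:
    """Split a one-letter sequence between each Asp (D) and a following Pro (P)."""
    fragments = []
    start = 0
    for i in range(len(sequence) - 1):
        if sequence[i] == "D" and sequence[i + 1] == "P":
            fragments.append(sequence[start:i + 1])  # fragment ends with Asp
            start = i + 1                            # next fragment begins with Pro
    fragments.append(sequence[start:])
    return fragments


# Hypothetical fusion: two invented peptides joined through an Asp-Pro pair.
acid_fragments = cleave_asp_pro("MKWKAAADPGTLAVGG")
```

A payload that itself contains an Asp-Pro pair would also be cut by this model, which mirrors the condition stated above that the peptides must not be affected by the acid.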

A number of patents have outlined improvements in methods of making hybrid
peptides or specific hybrid peptides targeted for specific uses. US patent 5,635,599 to
Pastan et al. outlines an improvement of hybrid proteins. This patent reports a circularly
permuted ligand as part of the hybrid peptide. This ligand possesses specificity and good
binding affinity. Another improvement in hybrid proteins is reported in U.S. patent
5,648,244 to Kuliopulos. This patent describes a method for producing a hybrid peptide
with a carrier peptide. This nucleic acid region, when recognized by a restriction
endonuclease, creates a nonpalindromic 3-base overhang. This allows the vector to be
cleaved.

An example of a specifically targeted hybrid protein is reported in U.S. patent
5,643,756. This patent reports a vector for expression of glycosylated proteins in cells.
This hybrid protein is adapted for use in proper immunoreactivity of HIV gp120. The
isolation of gp120 domains which are highly glycosylated is enhanced by this reported
vector.

U.S. patents 5,202,247 and 5,137,819 discuss hybrid proteins having polysaccharide
binding domains and methods and compositions for preparation of hybrid proteins which
are capable of binding to a polysaccharide matrix. U.S. patent 5,202,247 specifically
teaches a hybrid protein linking a cellulase binding region to a peptide of interest. The
patent specifies that the hybrid protein can be purified after expression in a bacterial host
by affinity chromatography on cellulose.

The development of genetic engineering techniques has made it possible to transfer
genes from various organisms and plants into other organisms or plants. Although starch
has been altered by transformation and mutagenesis in the past, there is still a need for
further starch modification. To this end, vectors that provide for encapsulation of desired
amino acids or peptides within the starch and specifically within the starch granule are
desirable. The resultant starch is modified and the tissue from the plant carrying the
vector is modified.



SUMMARY OF THE INVENTION 



This invention provides a hybrid polypeptide comprising a starch-encapsulating
region (SER) from a starch-binding enzyme fused to a payload polypeptide which is not
endogenous to said starch-encapsulating region, i.e. does not naturally occur linked to the
starch-encapsulating region. The hybrid polypeptide is useful to make modified starches
comprising the payload polypeptide. Such modified starches may be used to provide grain
feeds enriched in certain amino acids. Such modified starches are also useful for
providing polypeptides such as hormones and other medicaments, e.g. insulin, in a starch-
encapsulated form to resist degradation by stomach acids. The hybrid polypeptides are
also useful for producing the payload polypeptides in easily-purified form. For example,
such hybrid polypeptides produced by bacterial fermentation, or in grains or animals, may
be isolated and purified from the modified starches with which they are associated by art-
known techniques.

The term "polypeptide" as used herein means a plurality of identical or different 
amino acids, and also encompasses proteins. 



The term "hybrid polypeptide" means a polypeptide composed of peptides or
polypeptides from at least two different sources, e.g. a starch-encapsulating region of a
starch-binding enzyme, fused to another polypeptide such as a hormone, wherein at least
two component parts of the hybrid polypeptide do not occur fused together in nature.

The term "payload polypeptide" means a polypeptide not endogenous to the starch-
encapsulating region whose expression is desired in association with this region to express
a modified starch containing the payload polypeptide.

When the payload polypeptide is to be used to enhance the amino acid content of
particular amino acids in the modified starch, it preferably consists of not more than three
different types of amino acids selected from the group consisting of: Ala, Arg, Asn, Asp,
Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, and Val.
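
The stated preference, a payload built from not more than three different types of amino acids drawn from the twenty listed, can be expressed as a simple check on a one-letter sequence. The Python sketch below is illustrative only. The function name, the default limit parameter and the example sequences are hypothetical and do not appear in this specification.

```python
# One-letter codes for the twenty amino acids listed in the text.
STANDARD_AMINO_ACIDS = set("ARNDCQEGHILKMFPSTWYV")


def is_enrichment_payload(sequence: str, max_types: int = 3) -> bool:
    """True if the sequence uses only listed residues and at most max_types kinds."""
    residues = set(sequence.upper())
    if not residues or not residues <= STANDARD_AMINO_ACIDS:
        return False  # empty input or a symbol outside the listed twenty
    return len(residues) <= max_types


# Invented examples: a Lys/Met/Trp-rich peptide passes, a four-type peptide does not.
lys_met_trp_ok = is_enrichment_payload("KKMKWKKM")
four_types_ok = is_enrichment_payload("KMWT")
```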

When the payload polypeptide is to be used to supply a biologically active
polypeptide to either the host organism or another organism, the payload polypeptide may
be a biologically active polypeptide such as a hormone, e.g., insulin, a growth factor, e.g.
somatotropin, an antibody, enzyme, immunoglobulin, or dye, or may be a biologically
active fragment thereof as is known to the art. So long as the polypeptide has biological
activity, it does not need to be a naturally-occurring polypeptide, but may be mutated,
truncated, or otherwise modified. Such biologically active polypeptides may be modified
polypeptides, containing only biologically-active portions of biologically-active
polypeptides. They may also be amino acid sequences homologous to naturally-occurring
biologically-active amino acid sequences (preferably at least about 75% homologous)
which retain biological activity.
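
This specification does not state how the "at least about 75% homologous" figure is to be measured. One common reading is percent identity over an alignment, and the Python sketch below illustrates that reading only. It assumes the two sequences have already been aligned to equal length with "-" marking gaps. The function names and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared alignment columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """Apply the 'at least about 75%' preference, read here as percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Invented eight-residue alignment with six identical positions.
identity = percent_identity("ACDEFGHK", "ACDEFGYR")
```

Other measures, such as similarity scoring with a substitution matrix, would give different numbers for the same pair of sequences.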

The starch-encapsulating region of the hybrid polypeptide may be a starch-
encapsulating region of any starch-binding enzyme known to the art, e.g. an enzyme
selected from the group consisting of soluble starch synthase I, soluble starch synthase II,
soluble starch synthase III, granule-bound starch synthase, branching enzyme I, branching
enzyme IIa, branching enzyme IIb and glucoamylase polypeptides.

When the hybrid polypeptide is to be used to produce payload polypeptide in pure
or partially purified form, the hybrid polypeptide preferably comprises a cleavage site
between the starch-encapsulating region and the payload polypeptide. The method of
isolating the purified payload polypeptide then includes the step of contacting the hybrid
polypeptide with a cleaving agent specific for that cleavage site.

This invention also provides recombinant nucleic acid (RNA or DNA) molecules
encoding the hybrid polypeptides. Such recombinant nucleic acid molecules preferably
comprise control sequences adapted for expression of the hybrid polypeptide in the
selected host. The term "control sequences" includes promoters, introns, preferred codon
sequences for the particular host organism, and other sequences known to the art to affect
expression of DNA or RNA in particular hosts. The nucleic acid sequences encoding the
starch-encapsulating region and the payload polypeptide may be naturally-occurring
nucleic acid sequences, or biologically-active fragments thereof, or may be biologically-
active sequences homologous to such sequences, preferably at least about 75%
homologous to such sequences.

Host organisms include bacteria, plants, and animals. Preferred hosts are plants. 
Both monocotyledonous plants (monocots) and dicotyledonous plants (dicots) are useful 
hosts for expressing the hybrid polypeptides of this invention. 

This invention also provides expression vectors comprising the nucleic acids
encoding the hybrid proteins of this invention. These expression vectors are used for
transforming the nucleic acids into host organisms and may also comprise sequences
aiding in the expression of the nucleic acids in the host organism. The expression vectors
may be plasmids, modified viruses, or DNA or RNA molecules, or other vectors useful in
transformation systems known to the art.

By the methods of this invention, transformed cells are produced comprising the
recombinant nucleic acid molecules capable of expressing the hybrid polypeptides of this
invention. These may be prokaryotic or eukaryotic cells from one-celled organisms, plants or
animals. They may be bacterial cells from which the hybrid polypeptide may be
harvested. Alternatively, they may be plant cells which may be regenerated into plants from
which the hybrid polypeptide may be harvested, or such plant cells may be regenerated into
fertile plants with seeds containing the nucleic acids encoding the hybrid polypeptide. In a
preferred embodiment, such seeds contain modified starch comprising the payload
polypeptide.

The term "modified starch" means the naturally-occurring starch has been modified 
to comprise the payload polypeptide. 



A method of targeting digestion of a payload polypeptide to a particular phase of
the digestive process, e.g., preventing degradation of a payload polypeptide in the stomach
of an animal, is also provided comprising feeding the animal a modified starch of this
invention comprising the payload polypeptide, whereby the polypeptide is protected by the
starch from degradation in the stomach of the animal. Alternatively, the starch may be
one known to be digested in the stomach to release the payload polypeptide there.

Preferred recombinant nucleic acid molecules of this invention comprise DNA
encoding starch-encapsulating regions selected from the starch synthesizing gene sequences
set forth in the tables hereof.

Preferred plasmids of this invention are adapted for use with specific hosts.
Plasmids comprising a promoter, a plastid-targeting sequence, a nucleic acid sequence
encoding a starch-encapsulating region, and a terminator sequence, are provided herein.
Such plasmids are suitable for insertion of DNA sequences encoding payload polypeptides
and starch-encapsulating regions for expression in selected hosts.

Plasmids of this invention can optionally include a spacer or a linker unit
proximate the fusion site between nucleic acids encoding the SER and the nucleic acids
encoding the payload polypeptide. This invention includes plasmids comprising promoters
adapted for prokaryotic or eukaryotic hosts. Such promoters may also be specifically
adapted for expression in monocots or in dicots.

A method of forming peptide-modified starch of this invention includes the steps
of: supplying a plasmid having a promoter associated with a nucleic acid sequence
encoding a starch-encapsulating region, the nucleic acid sequence encoding the starch-
encapsulating region being connected to a nucleic acid region encoding a payload
polypeptide, and transforming a host with the plasmid whereby the host expresses peptide-
modified starch.

This invention furthermore comprises starch-bearing grains comprising: an embryo;
nutritive tissues; and modified starch granules having encapsulated therein a protein that is
not endogenous to starch granules of said grain which are not modified. Such starch-
bearing grains may be grains wherein the embryo is a maize embryo, a rice embryo, or a
wheat embryo.

All publications referred to herein are incorporated by reference to the extent not
inconsistent herewith.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1a shows the plasmid pEXS114 which contains the synthetic GFP (Green
Fluorescent Protein) subcloned into pBSK from Stratagene.

FIG. 1b shows the plasmid pEXS115.

FIG. 2a shows the waxy gene with restriction sites subcloned into a
commercially available plasmid.

FIG. 2b shows the pET-21A plasmid commercially available from Novagen
having the GFP fragment from pEXS115 subcloned therein.

FIG. 3a shows pEXS114 subcloned into pEXSWX, and the GFP-FLWX map.

FIG. 3b shows the GFP-BamHIWX plasmid.

FIG. 4 shows the SGFP fragment of pEXS115 subcloned into pEXSWX, and the 
GFP-NcoWX map. 

FIG. 5 shows a linear depiction of a plasmid that is adapted for use in monocots.

FIG. 6 shows the plasmid pEXS52.

FIG. 7 shows the six introductory plasmids used to form pEXS51 and pEXS60.
FIG. 7a shows pEXS adh1. FIG. 7b shows pEXS adh1-nos3'. FIG. 7c shows pEXS33.
FIG. 7d shows pEXS10zp. FIG. 7e shows pEXS10zp-adh1. FIG. 7f shows pEXS10zp-
adh1-nos3'.

FIGS. 8a and 8b show the plasmids pEXS50 and pEXS51, respectively, containing
the MS-SIII gene which is a starch-soluble synthase gene.

FIG. 9a shows the plasmid pEXS60 which excludes the intron shown in pEXS50,
and FIG. 9b shows the plasmid pEXS61 which excludes the intron shown in pEXS60.

DETAILED DESCRIPTION 

The present invention provides, broadly, a hybrid polypeptide, a method for making
a hybrid polypeptide, and nucleic acids encoding the hybrid polypeptide. A hybrid
polypeptide consists of two or more subparts fused together into a single peptide chain.
The subparts can be amino acids or peptides or polypeptides. One of the subparts is a
starch-encapsulating region. Hybrid polypeptides may thus be targeted into starch granules
produced by organisms expressing the hybrid polypeptides.

A method of making the hybrid polypeptides within cells involves the preparation
of a DNA construct comprising at least a fragment of DNA encoding a sequence which
functions to bind the expression product of attached DNA into a granule of starch, ligated
to a DNA sequence encoding the polypeptide of interest (the payload polypeptide). This
construct is expressed within a eukaryotic or prokaryotic cell. The hybrid polypeptide can
be used to produce purified protein or to immobilize a protein of interest within the
protection of a starch granule, or to produce grain that contains foreign amino acids or
peptides.



The hybrid polypeptide according to the present invention has three regions. 



    | Payload Peptide | Central Site | Starch-encapsulating |
    | (X)             | (CS)*        | region (SER)         |

X is any amino acid or peptide of interest.
* optional component.

The gene for X can be placed in the 5' or 3' position within the DNA construct
described below.

CS is a central site which may be a leaving site, a cleavage site, or a spacer, as is
known to the art. A cleavage site is recognized by a cleaving enzyme. A cleaving
enzyme is an enzyme that cleaves peptides at a particular site. Examples of chemicals and
enzymes that have been employed to cleave polypeptides include thrombin, trypsin,
cyanogen bromide, formic acid, hydroxylamine, collagenase, and alasubtilisin. A spacer is a
peptide that joins the peptides comprising the hybrid polypeptide. Usually it does not have
any specific activity other than to join the peptides or to preserve some minimum distance
or to influence the folding, charge or water acceptance of the protein. Spacers may be any
peptide sequences not interfering with the biological activity of the hybrid polypeptide.
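
For illustration only, the three-region layout (payload X, optional central site CS, and SER) can be modeled as a small data structure that joins the parts in either orientation, reflecting that the gene for X can be placed in the 5' or 3' position. The Python sketch below is not part of the disclosed constructs. The class name, field names and all sequences are hypothetical placeholders, and the Asp-Pro pair is used as the central site only because an acid-labile aspartyl-proline linkage is mentioned in the background.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridPolypeptide:
    payload: str                # X: amino acid sequence of interest
    ser: str                    # SER: starch-encapsulating region
    central_site: str = ""      # CS: optional cleavage site or spacer
    payload_first: bool = True  # True when X is encoded 5' of the SER

    def sequence(self) -> str:
        """Join the regions in N-terminal to C-terminal order."""
        parts = [self.payload, self.central_site, self.ser]
        if not self.payload_first:
            parts.reverse()
        return "".join(parts)


# Placeholder sequences, invented for the example.
n_terminal_payload = HybridPolypeptide("MKWK", "GTLAV", central_site="DP")
c_terminal_payload = HybridPolypeptide("MKWK", "GTLAV", payload_first=False)
```

Leaving `central_site` empty models a direct fusion, consistent with the central site being an optional component.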

The starch-encapsulating region (SER) is the region of the subject polypeptide that
has a binding affinity for starch. Usually the SER is selected from the group consisting of
peptides comprising starch-binding regions of starch synthases and branching enzymes of
plants, but can include starch binding domains from other sources such as glucoamylase
and the like. In the preferred embodiments of the invention, the SER includes peptide
products of genes that naturally occur in the starch synthesis pathway. This subset of
preferred SERs is defined as starch-forming encapsulating regions (SFER). A further
subset of SERs preferred herein is the specific starch-encapsulating regions (SSER) from
the specific enzymes starch synthase (STS), granule-bound starch synthase (GBSTS) and
branching enzymes (BE) of starch-bearing plants. The most preferred gene product from
this set is the GBSTS. Additionally, starch synthase I and branching enzyme II are useful
gene products. Preferably, the SER (and all the subsets discussed above) are truncated
versions of the full length starch synthesizing enzyme gene such that the truncated portion
includes the starch-encapsulating region.

The DNA construct for expressing the hybrid polypeptide within the host, broadly
is as follows:

    | Promoter | Intron* | Transit Peptide | X | SER | Terminator |
    |          |         | Coding Region*  |   |     |            |

* optional component. Other optional components can also be used.



As is known to the art, a promoter is a region of DNA controlling transcription.
Different types of promoters are selected for different hosts. Lac and T7 promoters work
well in prokaryotes, the 35S CaMV promoter works well in dicots, and the polyubiquitin
promoter works well in many monocots. Any number of different promoters are known to
the art and can be used within the scope of this invention.

Also as is known to the art, an intron is a nucleotide sequence in a gene that does 
not code for the gene product. One example of an intron that often increases expression 
in monocots is the Adhl intron. This component of the construct is optional. 

The transit peptide coding region is a nucleotide sequence that encodes for the
translocation of the protein into organelles such as plastids. It is preferred to choose a
transit peptide that is recognized by and compatible with the host in which the transit
peptide is employed. In this invention the plastid of choice is the amyloplast.

It is preferred that the hybrid polypeptide be located within the amyloplast in cells
such as plant cells which synthesize and store starch in amyloplasts. If the host is a
bacterial or other cell that does not contain an amyloplast, there need not be a transit
peptide coding region.

A terminator is a DNA sequence that terminates the transcription. 
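
The component order described above, together with the host-dependent choices, can be summarized in a short sketch. The Python code below is illustrative only and is not a disclosed construct. It assumes three host labels, picks T7 as one of the two prokaryotic promoters named in the text, and treats the transit peptide coding region as omitted for prokaryotic hosts. The function name and the component labels are hypothetical.

```python
from typing import List

# Example promoter per host class, taken from the promoters named in the text.
PROMOTER_BY_HOST = {
    "prokaryote": "T7",
    "dicot": "35S CaMV",
    "monocot": "polyubiquitin",
}


def build_construct(host: str, use_intron: bool = False,
                    payload_first: bool = True) -> List[str]:
    """Return the ordered component labels of a construct for the given host."""
    if host not in PROMOTER_BY_HOST:
        raise ValueError("unknown host class: " + host)
    parts = [PROMOTER_BY_HOST[host] + " promoter"]
    if use_intron:
        parts.append("intron")  # optional component
    if host != "prokaryote":
        # Assumption: only hosts with amyloplasts receive a transit peptide.
        parts.append("transit peptide coding region")
    coding = ["X (payload coding region)", "SER coding region"]
    if not payload_first:
        coding.reverse()  # X may sit 5' or 3' of the SER
    parts.extend(coding)
    parts.append("terminator")
    return parts


monocot_construct = build_construct("monocot", use_intron=True)
```

For a monocot host, the intron slot could hold the Adh1 intron given above as an example of an intron that often increases expression.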

X is the coding region for the payload polypeptide, which may be any polypeptide
of interest, or chains of amino acids. It may have up to an entire sequence of a known
polypeptide or comprise a useful fragment thereof. The payload polypeptide may be a
polypeptide, a fragment thereof, or biologically active protein which is an enzyme,
hormone, growth factor, immunoglobulin, dye, etc. Examples of some of the payload
polypeptides that can be employed in this invention include, but are not limited to,
prolactin (PRL), serum albumin, growth factors and growth hormones, i.e., somatotropin.
Serum albumins include bovine, ovine, equine, avian and human serum albumin. Growth
factors include epidermal growth factor (EGF), insulin-like growth factor I (IGF-I), insulin-
like growth factor II (IGF-II), fibroblast growth factor (FGF), transforming growth factor
alpha (TGF-alpha), transforming growth factor beta (TGF-beta), nerve growth factor
(NGF), platelet-derived growth factor (PDGF), and recombinant human insulin-like growth
factors I (rHuIGF-I) and II (rHuIGF-II). Somatotropins which can be employed to practice
this invention include, but are not limited to, bovine, porcine, ovine, equine, avian and
human somatotropin. Porcine somatotropin includes delta-7 recombinant porcine
somatotropin, as described and claimed in European Patent Application Publication No.
104,920 (Biogen). Preferred payload polypeptides are somatotropin, insulin A and B
chains, calcitonin, beta endorphin, urogastrone, beta globin, myoglobin, human growth
hormone, angiotensin, proline, proteases, beta-galactosidase, and cellulases.

The hybrid polypeptide, the SER region and the payload polypeptides may also 
include post-translational modifications known to the art such as glycosylation, acylation, 
and other modifications not interfering with the desired activity of the polypeptide. 



Developing a Hybrid Polypeptide

The SER region is present in genes involved in starch synthesis. Methods for 
isolating such genes include screening from genomic DNA libraries and from cDNA 
libraries. Genes can be cut and changed by ligation, mutation agents, digestion, restriction 
and other such procedures, e.g., as outlined in Maniatis et al., Molecular Cloning, Cold 
Spring Harbor Labs, Cold Spring Harbor, N.Y. Examples of excellent starting materials 
for accessing the SER region include, but are not limited to, the following: starch 
synthases I, II, III and IV, Branching Enzymes I, IIA and IIB, and granule-bound starch 
synthase (GBSTS). These genes are present in starch-bearing plants such as rice, maize, 
peas, potatoes, wheat, and the like. Use of a probe of SER made from genomic DNA or cDNA 
or mRNA or antibodies raised against the SER allows for the isolation and identification 
of useful genes for cloning. The starch enzyme-encoding sequences may be modified as 
long as the modifications do not interfere with the ability of the SER region to encapsulate 
associated polypeptides. 



When genes encoding proteins that are encapsulated into the starch granule are 
located, then several approaches to isolation of the SER can be employed, as is known to 
the art. One method is to cut the gene with restriction enzymes at various sites, deleting 
sections from the N-terminal end and allowing the resultant protein to express. The 
expressed truncated protein is then run on a starch gel to evaluate the association and 
dissociation constant of the remaining protein. Marker genes known to the art, e.g., green 
fluorescent protein gene, may be attached to the truncated protein and used to determine 
the presence of the marker gene in the starch granule. 
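
The N-terminal deletion approach described above can be pictured as a series of progressively shorter proteins, each of which would then be tested for starch association. The following is a minimal sketch only; the function name, the step size, and the toy sequence are illustrative assumptions and do not correspond to any real starch synthase or to the restriction sites actually used.

```python
def n_terminal_truncations(protein, step, min_length):
    """Return (residues_removed, remaining_sequence) pairs for a deletion series.

    Residues are removed from the N-terminal end in increments of `step`
    until the remaining fragment would be shorter than `min_length`.
    """
    series = []
    removed = 0
    while len(protein) - removed >= min_length:
        series.append((removed, protein[removed:]))
        removed += step
    return series


# Hypothetical 30-residue toy sequence (not a real protein).
TOY_PROTEIN = "MASTVKLQERGHDNPWYFICAMASTVKLQE"

deletion_series = n_terminal_truncations(TOY_PROTEIN, step=10, min_length=10)
for removed, fragment in deletion_series:
    print(f"removed {removed:2d} residues, {len(fragment):2d} remain: {fragment}")
```

Each fragment in such a series retains the C-terminal portion of the protein, so a loss of starch association between two consecutive fragments would suggest that the deleted stretch contributes to the SER.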

Once the SER gene sequence region is isolated it can be used in making the gene 
fragment sequence that will express the payload polypeptide encapsulated in starch. The 
SER gene sequence and the gene sequence encoding the payload polypeptide can be 
ligated together. The resulting fused DNA can then be placed in a number of vector 
constructs for expression in a number of hosts. The preferred hosts form starch granules 
in plastids, but the testing of the SER can be readily performed in bacterial hosts such as 
E. coli. 

The nucleic acid sequence coding for the payload polypeptide may be derived from 
DNA, RNA, genomic DNA, cDNA, mRNA or may be synthesized in whole or in part. 
The sequence of the payload polypeptide can be manipulated to contain mutations such 
that the protein produced is a novel, mutant protein, so long as biological function is 
maintained. 

When the payload polypeptide-encoding nucleic acid sequence is ligated onto the 
SER-encoding sequence, the gene sequence for the payload polypeptide is preferably 
attached at the end of the SER sequence coding for the N-terminus. Although the 
N-terminus end is preferred, it does not appear critical to the invention whether the payload 
polypeptide is ligated onto the N-terminus end or the C-terminus end of the SER. Clearly, 
the method of forming the recombinant nucleic acid molecules of this invention, whether 
synthetically, or by cloning and ligation, is not critical to the present invention. 

The central region of the hybrid polypeptide is optional. For some applications of 
the present invention it can be very useful to introduce DNA coding for a convenient 
protease cleavage site in this region into the recombinant nucleic acid molecule used to 
express the hybrid polypeptide. Alternatively, it can be useful to introduce DNA coding 
for an amino acid sequence that is pH-sensitive to form the central region. If the use of 
the present invention is to develop a pure protein that can be extracted and released from 
the starch granule by a protease or the like, then a protease cleavage site is useful. 
Additionally, if the protein is to be digested in an animal then a protease cleavage site 
may be useful to assist the enzymes in the digestive tract of the animal to release the 
protein from the starch. In other applications and in many digestive uses the cleavage site 
would be superfluous. 

The central region site may comprise a spacer. A spacer refers to a peptide that 
joins the proteins comprising a hybrid polypeptide. Usually it does not have any specific 
activity other than to join the proteins, to preserve some minimum distance, or to influence 
the folding, charge or hydrophobic or hydrophilic nature of the hybrid polypeptide. 
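
The arrangement of payload polypeptide, optional central region and SER described above can be sketched as a simple concatenation of regions. This is an illustration only: the class name, the toy sequences, and the GGGGS spacer are assumptions chosen for the example and are not sequences disclosed in this specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridPolypeptide:
    payload: str                        # payload polypeptide (one-letter codes)
    ser: str                            # starch-encapsulating region
    central: str = ""                   # optional cleavage site and/or spacer
    payload_at_n_terminus: bool = True  # False places the SER at the N-terminus

    def sequence(self) -> str:
        """Concatenate the regions in N-terminus to C-terminus order."""
        regions = [self.payload, self.central, self.ser]
        if not self.payload_at_n_terminus:
            regions.reverse()
        return "".join(regions)


# Hypothetical toy regions, for illustration only.
TOY_PAYLOAD = "MKTAYIAK"
TOY_SER = "WNDGSTYR"
TOY_SPACER = "GGGGS"

with_spacer = HybridPolypeptide(TOY_PAYLOAD, TOY_SER, TOY_SPACER)
ser_first = HybridPolypeptide(TOY_PAYLOAD, TOY_SER, TOY_SPACER,
                              payload_at_n_terminus=False)
no_central = HybridPolypeptide(TOY_PAYLOAD, TOY_SER)

print(with_spacer.sequence())
print(ser_first.sequence())
print(no_central.sequence())
```

Leaving `central` empty models the case in which the central region is omitted, as the text permits.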

Construct Development 

Once the ligated DNA which encodes the hybrid polypeptide is formed, then 
cloning vectors or plasmids are prepared which are capable of transferring the DNA to a 
host for expressing the hybrid polypeptides. The recombinant nucleic acid sequence of 
this invention is inserted into a convenient cloning vector or plasmid. For the present 
invention the preferred host is a starch granule-producing host. However, bacterial hosts 
can also be employed. Especially useful are bacterial hosts that have been transformed to 
contain some or all of the starch-synthesizing genes of a plant. The ordinarily skilled 
person in the art understands that the plasmid is tailored to the host. For example, in a 
bacterial host transcriptional regulatory promoters include lac, tac, trp and the like. 
Additionally, DNA coding for a transit peptide most likely would not be used and a 
secretory leader that is upstream from the structural gene may be used to get the 
polypeptide into the medium. Alternatively, the product is retained in the host and the 
host is lysed and the product isolated and purified by starch extraction methods or by 
binding the material to a starch matrix (or a starch-like matrix such as amylose or 
amylopectin, glycogen or the like) to extract the product. 

The preferred host is a plant and thus the preferred plasmid is adapted to be useful 
in a plant. The plasmid should contain a promoter, preferably a promoter adapted to 
target the expression of the protein in the starch-containing tissue of the plant. The 
promoter may be specific for various tissues such as seeds, roots, tubers and the like; or it 
can be a constitutive promoter for gene expression throughout the tissues of the plant. 
Well-known promoters include the 10 kD zein (maize) promoter, the CAB promoter, 
patatin, 35S and 19S cauliflower mosaic virus promoters (very useful in dicots), the 
polyubiquitin promoter (useful in monocots) and enhancements and modifications thereof 
known to the art. 

The cloning vector may contain coding sequences for a transit peptide to direct the 
plasmid into the correct location. Examples of transit peptide-coding sequences are shown 
in the sequence tables. Coding sequences for other transit peptides can be used. Transit 
peptides naturally occurring in the host to be used are preferred. Preferred transit peptide 
coding regions for maize are shown in the tables and figures hereof. The purpose of the 
transit peptide is to target the vector to the correct intracellular area. 

Attached to the transit peptide-encoding sequence is the DNA sequence encoding 
the N-terminal end of the payload polypeptide. The direction of the sequence encoding 
the payload polypeptide is varied depending on whether sense or antisense transcription is 
desired. DNA constructs of this invention specifically described herein have the sequence 
encoding the payload polypeptide at the N-terminus end but the SER coding region can 
also be at the N-terminus end and the payload polypeptide sequence following. At the end 
of the DNA construct is the terminator sequence. Such sequences are well known in the 
art. 

The cloning vector is transformed into a host. Introduction of the cloning vector, 
preferably a plasmid, into the host can be done by a number of transformation techniques 
known to the art. These techniques may vary by host but they include microparticle 
bombardment, microinjection, Agrobacterium transformation, "whiskers" technology (U.S. 
Patent Nos. 5,302,523 and 5,464,765), electroporation and the like. If the host is a plant, 
the cells can be regenerated to form plants. Methods of regenerating plants are known in 
the art. Once the host is transformed and the proteins expressed therein, the presence of 
the DNA encoding the payload polypeptide in the host is confirmable. The presence of 
expressed proteins may be confirmed by Western Blot or ELISA or as a result of a change 
in the plant or the cell. 

Uses of Encapsulated Protein 

There are a number of applications of this invention. The hybrid polypeptide can 
be cleaved in a pure state from the starch (cleavage sites can be included) and pure protein 
can be recovered. Alternatively, the encapsulated payload polypeptide within the starch 
can be used in raw form to deliver protein to various parts of the digestive tract of the 
consuming animal ("animal" shall include mammals, birds and fish). For example, if the 
starch in which the material is encapsulated is resistant to digestion then the protein will 
be released slowly into the intestine of the animal, therefore avoiding degradation of the 
valuable protein in the stomach. Amino acids such as methionine and lysine may be 
encapsulated to be incorporated directly into the grain that the animal is fed, thus 
eliminating the need for supplementing the diet with these amino acids in other forms. 

The present invention allows hormones, enzymes, proteins, proteinaceous nutrients 
and proteinaceous medicines to be targeted to specific digestive areas in the digestive 
tracts of animals. Proteins that normally are digested in the upper digestive tract, when 
encapsulated in starch, are able to pass through the stomach in a nondigested manner and 
be absorbed intact or in part by the intestine. If capable of passing through the intestinal 
wall, the payload polypeptides can be used for medicating an animal, or providing 
hormones such as growth factors, e.g., somatotropin, for vaccination of an animal or for 
enhancing the nutrients available to an animal. 

If the starch used is not resistant to digestion in the stomach (for example the 
sugary 2 starch is highly digestible), then the added protein can be targeted to be absorbed 
in the upper digestive tract of the animal. This would require that the host used to 
produce the modified starch be mutated or transformed to make sugary 2 type starch. The 
present invention encompasses the use of mutant organisms that form modified starch as 
hosts. Some examples of these mutant hosts include rice and maize and the like having 
sugary 1, sugary 2, brittle, shrunken, waxy, amylose extender, dull, opaque, and floury 
mutations, and the like. These mutant starches and starches from different plant sources 
have different levels of digestibility. Thus by selection of the host for expression of the 
DNA and of the animal to which the modified starch is fed, the hybrid polypeptide can be 
digested where it is targeted. Different proteins are absorbed most efficiently by different 
parts of the body. By encapsulating the protein in starch that has the selected digestibility, 
the protein can be supplied anywhere throughout the digestive tract and at specific times 
during the digestive process. 

Another of the advantages of the present invention is the ability to inhibit or 
express differing levels of glycosylation of the desired polypeptide. The encapsulating 
procedure may allow the protein to be expressed within the granule in a different 
glycosylation state than if expressed by other DNA molecules. The glycosylation will 
depend on the amount of encapsulation, the host employed and the sequence of the 
polypeptide. 

Improved crops having the above-described characteristics may be produced by 
genetic manipulation of plants known to possess other favorable characteristics. By 
manipulating the nucleotide sequence of a starch-synthesizing enzyme gene, it is possible 
to alter the amount of key amino acids, proteins or peptides produced in a plant. One or 
more genetically engineered gene constructs, which may be of plant, fungal, bacterial or 
animal origin, may be incorporated into the plant genome by sexual crossing or by 
transformation. Engineered genes may comprise additional copies of wildtype genes or 
may encode modified or allelic or alternative enzymes with new properties. Incorporation 
of such gene construct(s) may have varying effects depending on the amount and type of 
gene(s) introduced (in a sense or antisense orientation). It may increase the plant's 
capacity to produce a specific protein or peptide, or provide an improved amino acid balance. 

Cloning Enzymes Involved in Starch Biosynthesis 

Known cloning techniques may be used to provide the DNA constructs of this 
invention. The source of the special forms of the SSTS, GBSTS, BE, glycogen synthase 
(GS), amylopectin, or other genes used herein may be any organism that can make starch 
or glycogen. Potential donor organisms are screened and identified. Thereafter there can 
be two approaches: (a) using enzyme purification and antibody/sequence generation 
following the protocols described herein; (b) using SSTS, GBSTS, BE, GS, amylopectin or 
other cDNAs as heterologous probes to identify the genomic DNAs for SSTS, GBSTS, 
BE, GS, amylopectin or other starch-encapsulating enzymes in libraries from the organism 
concerned. Gene transformation, plant regeneration and testing protocols are known to the 
art. In this instance it is necessary to make gene constructs for transformation which 
contain regulatory sequences that ensure expression during starch formation. These 
regulatory sequences are present in many small grains and in tubers and roots. For 
example, these regulatory sequences are readily available in the maize endosperm in DNA 
encoding Granule-Bound Starch Synthase (GBSTS), Soluble Starch Synthases (SSTS) or 
Branching Enzymes (BE) or other maize endosperm starch synthesis pathway enzymes. 
These regulatory sequences from the endosperm ensure protein expression at the correct 
developmental time (e.g., ADPG pyrophosphorylase). 

In this method we measure starch-binding constants of starch-binding proteins 
using native protein electrophoresis in the presence of suitable concentrations of 
carbohydrates such as glycogen or amylopectin. Starch-encapsulating regions can be 
elucidated using site-directed mutagenesis and other genetic engineering methods known to 
those skilled in the art. Novel genetically-engineered proteins carrying novel peptides or 
amino acid combinations can be evaluated using the methods described herein. 

EXAMPLES 

Example One: 

Method for Identification of Starch-encapsulating Proteins 



Starch-Granule Protein Isolation: 

Homogenize 12.5 g grain in 25 ml Extraction buffer (50 mM Tris acetate, pH 7.5, 
1 mM EDTA, 1 mM DTT) for 3 x 20 seconds in Waring blender with 1 min intervals 
between blending. Keep samples on ice. Filter through mira cloth and centrifuge at 6,000 
rpm for 30 min. Discard supernatant and scrape off discolored solids which overlay white 
starch pellet. Resuspend pellet in 25 ml buffer and recentrifuge. Repeat washes twice 
more. Resuspend washed pellet in -20°C acetone, allow pellet to settle at -20°C. Repeat. 
Dry starch under stream of air. Store at -20°C. 



Protein Extraction: 

Mix 50 mg starch with 1 ml 2% SDS in eppendorf. Vortex, spin at 18,000 rpm, 5 
min, 4°C. Pour off supernatant. Repeat twice. Add 1 ml sample buffer (4 ml distilled 
water, 1 ml 0.5 M Tris-HCl, pH 6.8, 0.8 ml glycerol, 1.6 ml 10% SDS, 0.4 ml 
β-mercaptoethanol, 0.2 ml 0.5% bromophenol blue). Boil eppendorf for 10 min with hole in 
lid. Cool, centrifuge 10,000 rpm for 10 min. Decant supernatant into new eppendorf. Boil 
for 4 minutes with standards. Cool. 



SDS-PAGE Gels: (non-denaturing) 

                             10% Resolve    4% Stack 
Acryl/Bis 40% stock          2.5 ml         1.0 ml 
1.5 M Tris pH 8.8            2.5 ml         - 
0.5 M Tris pH 8.8            -              2.5 ml 
10% SDS                      100 µl         100 µl 
Water                        4.845 ml       6.34 ml 
Degas 15 min, add fresh: 
10% Ammonium Persulfate      50 µl          50 µl 
TEMED                        5 µl           10 µl 

Mini-Protean II Dual Slab Cell; 3.5 ml of Resolve buffer per gel. 4% Stack is poured on 
top. The gel is run at 200 V constant voltage. 10 x Running buffer (250 mM Tris, 1.92 M 
glycine, 1% SDS, pH 8.3). 
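
As a consistency check on the gel recipe above, the listed volumes can be summed and the final acrylamide concentration computed from the 40% stock. The sketch below is illustrative only; the dictionary keys and function names are assumptions, and the microliter volumes are read as µl and converted to ml.

```python
# Volumes in ml, transcribed from the recipe above (µl converted to ml).
RESOLVE_10 = {
    "acryl_bis_40": 2.5,
    "tris": 2.5,
    "sds_10": 0.100,
    "water": 4.845,
    "aps_10": 0.050,
    "temed": 0.005,
}
STACK_4 = {
    "acryl_bis_40": 1.0,
    "tris": 2.5,
    "sds_10": 0.100,
    "water": 6.34,
    "aps_10": 0.050,
    "temed": 0.010,
}


def total_ml(recipe):
    """Total volume of the gel mix in ml."""
    return sum(recipe.values())


def final_acrylamide_percent(recipe, stock_percent=40.0):
    """Final acrylamide/bis concentration after dilution of the stock."""
    return stock_percent * recipe["acryl_bis_40"] / total_ml(recipe)


for name, recipe in (("resolve", RESOLVE_10), ("stack", STACK_4)):
    print(f"{name}: {total_ml(recipe):.3f} ml total, "
          f"{final_acrylamide_percent(recipe):.2f}% acrylamide")
```

Under these readings both mixes come to a 10 ml total, which agrees with the nominal 10% and 4% gel concentrations.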



Method of Measurement of Starch-Encapsulating Regions: 

Solutions: 

Extraction Buffer:             50 mM Tris-acetate pH 7.5, 10 mM EDTA, 10% 
                               sucrose, 2.5 mM DTT-fresh. 
Stacking Buffer:               0.5 M Tris-HCl, pH 6.8 
Resolve Buffer:                1.5 M Tris-HCl, pH 8.8 
10 X Lower Electrode Buffer:   30.3 g Tris + 144 g Glycine qs to 1 L. (pH is ~8.3, no 
                               adjustment). Dilute for use. 
Upper Electrode Buffer:        Same as Lower 
Sucrose Solution:              18.66 g sucrose + 100 ml dH2O 
30% Acryl/Bis Stock (2.67% C): 146 g acrylamide + 4 g bis + 350 ml dH2O. Bring up 
                               to 500 ml. Filter and store at 4°C in the dark for up 
                               to 1 month. 
15% Acryl/Bis Stock (20% C):   6 g acrylamide + 1.5 g bis + 25 ml dH2O. Bring up 
                               to 50 ml. Filter and store at 4°C in the dark for up to 
                               1 month. 
Riboflavin Solution:           1.4 g riboflavin + 100 ml dH2O. Store in dark for up 
                               to 1 month. 
SS Assay mix:                  25 mM Sodium Citrate, 25 mM Bicine-NaOH (pH 
                               8.0), 2 mM EDTA, 1 mM DTT-fresh, 1 mM 
                               Adenosine 5' Diphosphoglucose-fresh, 10 mg/ml rabbit 
                               liver glycogen Type III-fresh. 
Iodine Solution:               2 g iodine + 20 g KI, 0.1 N HCl up to 1 L. 

Extract: 

4 ml extraction buffer + 12 g endosperm. Homogenize. 
Filter through mira cloth or 4 layers cheesecloth, spin 20,000 g (14,500 rpm, SM-24 
rotor), 20 min., 4°C. 
Remove supernatant using a glass pipette. 
0.85 ml extract + 0.1 ml glycerol + 0.05 ml 0.5% bromophenol blue, 
vortex and spin 5 min. full speed microfuge. Use directly or freeze in liquid 
nitrogen and store at -80°C for up to 2 weeks. 

Cast Gels: 

Attach Gel Bond PAG film (FMC Industries, Rockland, ME) to (inside of) outer 
glass plate using two-sided scotch tape, hydrophilic side up. The tape and the film are 
lined up as closely and evenly as possible with the bottom of the plate. The film is 
slightly smaller than the plate. Squirt water between the film and the plate to adhere the 
film. Use a tissue to push out excess water. Set up plates as usual, then seal the bottom 
of the plates with tacky adhesive. The cassette will fit into the casting stand if the gray 
rubber is removed from the casting stand. The gel polymerizes with the film, and stays 
attached during all subsequent manipulations. 

Cast 4.5% T resolve mini-gel (0.75 mm): 
2.25 ml dH2O 
+ 3.75 ml sucrose solution 
+ 2.5 ml resolve buffer 
+ 1.5 ml 30% Acryl/Bis stock 
+ various amounts of glycogen for each gel (i.e., 0 - 1.0%) 
DEGAS 15 MIN. 
+ 50 µl 10% APS 
+ 5 µl TEMED 
POLYMERIZE FOR 30 MIN. OR OVERNIGHT 

Cast 3.125% T stack: 
1.59 ml dH2O 
+ 3.75 ml sucrose solution 
+ 2.5 ml stack buffer 
+ 2.083 ml 15% Acryl/Bis stock 
DO NOT DEGAS 
+ 15 µl 10% APS 
+ 35 µl riboflavin solution 
+ 30 µl TEMED 
POLYMERIZE FOR 2.5 HOURS CLOSE TO A LIGHT BULB 
Cool at 4°C before pulling out combs. Can also not use combs, and just 
cast a centimeter of stacker. 
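
The %T and %C designations of the acrylamide stocks, and the final %T of the two cast gels, follow from the quantities listed in the recipes. A minimal sketch of that arithmetic is given below, using the usual definitions %T = (acrylamide + bis) / volume and %C = bis / (acrylamide + bis); the function names are illustrative, and the stack is assumed to have a nominal 10 ml final volume.

```python
def percent_t(acrylamide_g, bis_g, final_volume_ml):
    """Total monomer concentration, in g per 100 ml."""
    return 100.0 * (acrylamide_g + bis_g) / final_volume_ml


def percent_c(acrylamide_g, bis_g):
    """Crosslinker (bis) as a percentage of total monomer."""
    return 100.0 * bis_g / (acrylamide_g + bis_g)


def diluted_percent_t(stock_percent_t, stock_ml, gel_ml):
    """%T of a gel after diluting a stock into the gel volume."""
    return stock_percent_t * stock_ml / gel_ml


# Stocks: 146 g acrylamide + 4 g bis in 500 ml; 6 g + 1.5 g in 50 ml.
stock_30 = (percent_t(146, 4, 500), percent_c(146, 4))
stock_15 = (percent_t(6, 1.5, 50), percent_c(6, 1.5))

# Resolve gel: water + sucrose + buffer + stock (small additions ignored).
resolve_ml = 2.25 + 3.75 + 2.5 + 1.5
resolve_t = diluted_percent_t(30.0, 1.5, resolve_ml)

# Stack: nominal 10 ml final volume (an assumption of this sketch).
stack_t = diluted_percent_t(15.0, 2.083, 10.0)

print(f"30% stock: {stock_30[0]:.2f}% T, {stock_30[1]:.2f}% C")
print(f"15% stock: {stock_15[0]:.2f}% T, {stock_15[1]:.2f}% C")
print(f"resolve gel: {resolve_t:.3f}% T; stack: {stack_t:.3f}% T")
```

The computed values agree with the labels used in the recipes (30% T / 2.67% C, 15% T / 20% C, 4.5% T and about 3.125% T).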



The foregoing procedure: 

Can run at different temperatures; preincubate gels and solutions. 
Pre-run for 15 min. at 200 V. 
Load gel: 7 µl per well, or 115 µl if no comb. 
Run at 140 V until dye front is close to bottom. Various running temperatures are 
achieved by placing the whole gel rig into a water bath. Can occasionally stop the 
run to insert a temperature probe into the gel. 

Enzyme assay: Cut gels off at dye front. Incubate in SS Assay mix overnight at 
room temperature with gentle shaking. Rinse gels with water. Flood with I2/KI 
solution. 

Take pictures of the gels on a light box, and measure the pictures. Rm = mm from 
top of gel to the active band/mm from top of gel to the bottom of the gel where it 
was cut (where the dye front was). Plot % glycogen vs. 1/Rm. The point where 
the line intersects the x axis is -K (where y=0). 
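
The graphical determination of K described above can be sketched numerically: compute Rm for each gel, fit a straight line to 1/Rm against % glycogen, and take K as the negative of the x-intercept. The measurements below are hypothetical values chosen only to illustrate the calculation, and the function names are assumptions of this sketch.

```python
def relative_mobility(band_mm, dye_front_mm):
    """Rm = distance to active band / distance to dye front."""
    return band_mm / dye_front_mm


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical measurements: one gel per glycogen concentration.
glycogen_percent = [0.0, 0.25, 0.5, 1.0]
band_mm = [30.0, 20.0, 15.0, 10.0]
dye_front_mm = 60.0

inverse_rm = [1.0 / relative_mobility(b, dye_front_mm) for b in band_mm]
slope, intercept = fit_line(glycogen_percent, inverse_rm)

x_intercept = -intercept / slope   # where the fitted line crosses y = 0
K = -x_intercept                   # the x-intercept is -K

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, K = {K:.3f} % glycogen")
```

A smaller K indicates that less glycogen is needed to retard the protein, i.e., a higher affinity for the matrix.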

Testing and evaluation protocol for SER region length: 

Following the procedure above for selection of the SER region requires four basic 
steps. First, DNA encoding a protein having a starch-encapsulation region must be 
selected. This can be selected from known starch-synthesizing genes or starch-binding 
genes such as genes for amylases, for example. The protein must be extracted. A number 
of protein extraction techniques are well known in the art. The protein may be treated 
with proteases to form protein fragments of different lengths. The preferred fragments 
have deletions primarily from the N-terminus region of the protein. The SER region is 
located nearer to the C-terminus end than the N-terminus end. The protein is run on the 
gels described above and affinity for the gel matrix is evaluated. Higher affinity shows 
more preference of that region of the protein for the matrix. This method enables 
comparison of different proteins to identify the starch-encapsulating regions in natural or 
synthetic proteins. 

Example Two: 
SER Fusion Vector: 

The following fusion vectors are adapted for use in E. coli. The fusion gene that 
was attached to the probable SER in these vectors encoded the green fluorescent 
protein (GFP). Any number of different genes encoding proteins and polypeptides 
could be ligated into the vectors. A fusion vector was constructed having the SER of 
waxy maize fused to a second gene or gene fragment, in this case GFP. 

pEXS114 (see FIG. 1a): Synthetic GFP (SGFP) was PCR-amplified from the 
plasmid HBT-SGFP (from Jen Sheen; Dept. of Molecular Biology; Wellman 11, MGH; 
Boston, MA 02114) using the primers EXS73 (5'-GACTAGTCATATG GTG AGC AAG 
GGC GAG GAG-3') [SEQ ID NO:1] and EXS74 (5'-CTAGATCTTCATATG CTT GTA 
CAG CTC GTC CAT GCC-3') [SEQ ID NO:2]. The ends of the PCR product were 
polished off with T4 DNA polymerase to generate blunt ends; then the PCR product was 
digested with Spe I. This SGFP fragment was subcloned into the EcoRV-Spe I sites of 
pBSK (Stratagene, 11011 North Torrey Pines Rd., La Jolla, CA) to generate pEXS114. 

pEXS115 (see FIG. 1b): Synthetic GFP (SGFP) was PCR-amplified from the 
plasmid HBT-SGFP (from Jen Sheen) using the primers EXS73 (see above) and EXS75 
(5'-CTAGATCTTGGCCATGGC CTT GTA CAG CTC GTC CAT GCC-3') [SEQ ID 
NO:3]. The ends of the PCR product were polished off with T4 DNA polymerase to 
generate blunt ends; then the PCR product was digested with Spe I. This SGFP fragment 
was subcloned into the EcoRV-Spe I sites of pBSK (Stratagene), generating pEXS115. 

pEXSWX (see FIG. 2a): Maize WX was subcloned Nde I/Not I into pET-21a (see FIG. 
2b). The genomic DNA sequence and associated amino acids, from which the mRNA 
sequence can be generated, are shown in TABLES 1a and 1b below; alternatively, the 
DNA listed in the following tables could be employed. 
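
The 5' tails of the primers EXS73, EXS74 and EXS75 used for pEXS114 and pEXS115 carry restriction enzyme recognition sequences, which can be located with a simple string search. The sketch below is illustrative only: the primer sequences are transcribed from the text with spaces removed, the recognition sequences are the standard ones for the enzymes named, and of these enzymes the text expressly describes digestion of the PCR products only with Spe I.

```python
# Standard recognition sequences (5'->3') for the enzymes considered here.
RECOGNITION_SITES = {
    "SpeI": "ACTAGT",
    "NdeI": "CATATG",
    "BglII": "AGATCT",
    "NcoI": "CCATGG",
}

# Primer sequences transcribed from the text, spaces removed.
PRIMERS = {
    "EXS73": "GACTAGTCATATGGTGAGCAAGGGCGAGGAG",
    "EXS74": "CTAGATCTTCATATGCTTGTACAGCTCGTCCATGCC",
    "EXS75": "CTAGATCTTGGCCATGGCCTTGTACAGCTCGTCCATGCC",
}


def find_sites(sequence):
    """Map enzyme name to the 0-based index of its first site in `sequence`."""
    return {
        enzyme: sequence.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in sequence
    }


sites_by_primer = {name: find_sites(seq) for name, seq in PRIMERS.items()}
for name, sites in sites_by_primer.items():
    print(name, sites)
```

On this reading, the forward primer EXS73 carries Spe I and Nde I sites, while the reverse primers EXS74 and EXS75 carry Bgl II together with Nde I or Nco I, respectively.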

TABLE 1a 

DNA Sequence and Deduced Amino Acid Sequence 
of the waxy Gene in Maize 
[SEQ ID NO:4 and SEQ ID NO:5] 

LOCUS       ZMWAXY  4800 bp  DNA  PLN 
DEFINITION  Zea mays waxy (wx+) locus for UDP-glucose starch glycosyl 
            transferase. 
ACCESSION   X03935 M24258 
KEYWORDS    glycosyl transferase; transit peptide; 
            UDP-glucose starch glycosyl transferase; waxy locus. 
SOURCE      maize. 
  ORGANISM  Zea mays 
            Eukaryota; Plantae; Embryobionta; Magnoliophyta; Liliopsida; 
            Commelinidae; Cyperales; Poaceae. 
REFERENCE   1 (bases 1 to 4800) 
  AUTHORS   Kloesgen,R.B., Gierl,A., Schwarz-Sommer,Z. and Saedler,H. 
  TITLE     Molecular analysis of the waxy locus of Zea mays 
  JOURNAL   Mol. Gen. Genet. 203, 237-244 (1986) 
  STANDARD  full automatic 
COMMENT     NCBI gi: 22509 
FEATURES             Location/Qualifiers 
     source          1..4800 
                     /organism="Zea mays" 
     repeat_region   283..287 
                     /note="direct repeat 1" 
     repeat_region   288..292 
                     /note="direct repeat 1" 
     repeat_region   293..297 
                     /note="direct repeat 1" 
     repeat_region   298..302 
                     /note="direct repeat 1" 
     misc_feature    372..385 
                     /note="GC stretch (pot. regulatory factor binding site)" 
     misc_feature    442..468 
                     /note="GC stretch (pot. regulatory factor binding site)" 
     misc_feature    768..782 
                     /note="GC stretch (pot. regulatory factor binding site)" 
     misc_feature    810..822 
                     /note="GC stretch (pot. regulatory factor binding site)" 
     misc_feature    821..828 
                     /note="target duplication site (Ac7)" 
     CAAT_signal     821..828 
     TATA_signal     867..873 
     misc_feature    887..900 
                     /note="GC stretch (pot. regulatory factor binding site)" 
     misc_feature    901 
                     /note="transcriptional start site" 
     exon            901..1080 
                     /number=1 
     intron          1081..1219 
                     /number=1 
     exon            1220..1553 
                     /number=2 
     transit_peptide 1233..1448 
     CDS             join(1449..1553,1685..1765,1860..1958,2055..2144, 
                     2226..2289,2413..2513,2651..2760,2858..3101,3212..3394, 
                     3490..3681,3793..3879,3977..4105,4227..4343) 
                     /note="NCBI gi: 22510" 
                     /codon_start=1 
                     /product="glucosyl transferase" 
                     /translation="ASAGMNWFVGAZMAPWSXTGGLGDVLGGLPPAMAANGHRVMW 
                     SPRYDQYKDAWDTSWSEIKMGDGYSTVRFFHCYXRGVDRVFVDHPLFLERVWGKTEE 
                     KIYGPVAGTDyRDNQLRFSLLCQAALEAPRILSLNNNPYFSGPYGEDWFVCNDWHTG 
                     PLSCYLKSNYQSHGIYRDAKTAFCIHNISYQGRFAFSDYPELNLPERFKSSFDFIDGY 
                     EKPVEGRKINWMKAGILEADRVLTVSPYYAEELISGI.^RGCELDNIMRLTGITGIVNG 
                     MDVSEWDPSRDKYIAVKYDVSTAVEAKALNKEALQAEVGLPVDRNIPLVAFIGRLEEQ 
                     KGPDVMAAAIPQLMEMVEDVQIVLLGTGKKKFERMLMSAEEKFPGKVRAWKFNAALA 
                     HHIMAGADVLAVTSRFEPCGLIQLQGMRYGTPCACASTGGLVDTIIEGKTGFKMGRLS 
                     VDCNWEPADVXKVATTLQRAIKWGTPAYEEMVRNC.MIQDLSWKGPAXNWENVLLSL 
                     GVAGGEPGVEGESIAPLAKENVAAP" 


     intron          1554..1684 
                     /number=2 
     exon            1685..1765 
                     /number=3 
     intron          1766..1859 
                     /number=3 
     exon            1860..1958 
                     /number=4 
     intron          1959..2054 
                     /number=4 
     exon            2055..2144 
                     /number=5 
     intron          2145..2225 
                     /number=5 
     exon            2226..2289 
                     /number=6 
     intron          2290..2412 
                     /number=6 
     exon            2413..2513 
                     /number=7 
     intron          2514..2650 
                     /number=7 
     exon            2651..2760 
                     /number=8 
     intron          2761..2857 
                     /number=8 
     exon            2858..3101 
                     /number=9 
     intron          3102..3211 
                     /number=9 
     exon            3212..3394 
                     /number=10 
     misc_feature    3358..3365 
                     /note="target duplication site (Ac9)" 
     intron          3395..3489 
                     /number=10 
     exon            3490..3681 
                     /number=11 
     misc_feature    3570..3572 
                     /note="target duplication site (Spm 18)" 
     intron          3682..3792 
                     /number=11 
     exon            3793..3879 
                     /number=12 
     intron          3880..3976 
                     /number=12 
     exon            3977..4105 
                     /number=13 
     intron          4106..4226 
                     /number=13 
     exon            4227..4595 
                     /number=14 
     polyA_signal    4570..4575 
     polyA_signal    4593..4598 
     polyA_site      4595 
     polyA_signal    4597..4602 
     polyA_site      4618 
     polyA_site      4625 
BASE COUNT  935 A  1413 C  1447 G  1005 T 

ORIGIN 
1 CAGCGACCTA TTACACAGCC CGCTCGGGCC CGCGACGTCG GGACACATCT TCTTCCCCCT 
61 TTTGGTGAAG CTCTGCTCGC AGCTGTCCGG CTCCTTGGAC GTTCGTGTGG CAGATTCATC 
121 TGTTGTCTCG TCTCCTGTGC TTCCTGGGTA GCTTGTGTAG TGGAGCTGAC ATGGTCTGAG 
181 CAGGCTTAAA ATTTGCTCGT AGACGAGGAG TACCAGCACA GCACGTTGCG GATTTCTCTG 
241 CCTGTGAAGT GCAACGTCTA GGATTGTCAC ACGCCTTGGT CGCGTCGCGT CGCGTCGCGT 
301 CGATGCGGTG GTGAGCAGAG CAGCAACAGC TGGGCGGCCC AACGTTGGCT TCCGTGTCTT 
361 CGTCGTACGT ACGCGCGCGC CGGGGACACG CAGCAGAGAG CGGAGAGCGA GCCGTGCACG 
421 GGGAGGTGGT GTGGAAGTGG AGCCGCGCGC CCGGCCGCCC GCGCCCGGTG GGCAACCCAA 
481 AAGTACCCAC GACAAGCGAA GGCGCCAAAG CGATCCAAGC TCCGGAACGC AACAGCATGC 
541 GTCGCGTCGG AGAGCCAGCC ACAAGCAGCC GAGAACCGAA CCGGTGGGCG ACGCGTCATG 
601 GGACGGACGC GGGCGACGCT TCCAAACGGG CCACGTACGC CGGCGTGTGC GTGCGTGCAG 
661 ACGACAAGCC AAGGCGAGGC AGCCCCCGAT CGGGAAAGCG TTTTGGGCGC GAGCGCTGGC 
721 GTGCGGGTCA GTCGCTGGTG CGCAGTGCCG GGGGGAACGG GTATCGTGGG GGGCGCGGGC 
781 GGAGGAGAGC GTGGCGAGGG CCGAGAGCAG CGCGCGGCCG GGTCACGCAA CGCGCCCCAC 
841 GTACTGCCCT CCCCCTCCGC GCGCGCTAGA AATACCGAGG CCTGGACCGG GGGGGGGCCC 
901 CGTCACATCC ATCCATCGAC CGATCGATCG CCACAGCCAA CACCACCCGC CGAGGCGACG 
961 CGACAGCCGC CAGGAGGAAG GAATAAACTC ACTGCCAGCC AGTGAAGGGG GAGAAGTGTA 
1021 CTGCTCCGTC GACCAGTGCG CGCACCGCCC GGCAGGGCTG CTCATCTCGT CGACGACCAG 
1081 GTTCTGTTCC GTTCCGATCC GATCCGATCC TGTCCTTGAG TTTCGTCCAG ATCCTGGCGC 
1141 GTATCTGCGT GTTTGATGAT CCAGGTTCTT CGAACCTAAA TCTGTCCGTG CACACGTCTT 
1201 TTCTCTCTCT CCTACGCAGT GGATTAATCG GCATGGCGGC TCTGGCCACG TCGCAGCTCG 
1261 TCGCAACGCG CGCCGGCCTG GGCGTCCCGG ACGCGTCCAC GTTCCGCCGC GGCGCCGCGC 
1321 AGGGCCTGAG GGGGGCCCGG GCGTCGGCGG CGGCGGACAC GCTCAGCATG CGGACCAGCG 
1381 CGCGCGCGGC GCCCAGGCAC CAGCAGCAGG CGCGCCGCGG GGGCAGGTTC CCGTCGCTCG 
1441 TCGTGTGCGC CAGCGCCGGC ATGAACGTCG TCTTCGTCGG CGCCGAGATG GCGCCGTGGA 
1501 GCAAGACCGG CGGCCTCGGC GACGTCCTCG GCGGCCTGCC GCCGGCCATG GCCGTAAGCG 
1561 CGCGCACCGA GACATGCATC CGTTGGATCG CGTCTTCTTC GTGCTCTTGC CGCGTGCATG 
1621 ATGCATGTGT TTCCTCCTGG CTTGTGTTCG TGTATGTGAC GTGTTTGTTC GGGCATGCAT 
1681 GCAGGCGAAC GGGCACCGTG TCATGGTCGT CTCTCCCCGC TACGACCAGT ACAAGGACGC 
1741 CTGGGACACC AGCGTCGTGT CCGAGGTACG GCCACCGAGA CCAGATTCAG ATCACAGTCA 
1801 CACACACCGT CATATGAACC TTTCTCTGCT CTGATGCCTG CAACTGCAAA TGCATGCAGA 
1861 TCAAGATGGG AGACGGGTAC GAGACGGTCA GGTTCTTCCA CTGCTACAAG CGCGGAGTGG 
1921 ACCGCGTGTT CGTTGACCAC CCACTGTTCC TGGAGAGGGT GAGACGAGAT CTGATCACTC 
1981 GATACGCAAT TACCACCCCA TTGTAAGCAG TTACAGTGAG CTTTTTTTCC CCCCGGCCTG 
2041 GTCGCTGGTT TCAGGTTTGG GGAAAGACCG AGGAGAAGAT CTACGGGCCT GTCGCTGGAA 
2101 CGGACTACAG GGACAACCAG CTGCGGTTCA GCCTGCTATG CCAGGTCAGG ATGGCTTGGT 
2161 ACTACAACTT CATATCATCT GTATGCAGCA GTATACACTG ATGAGAAATG CATGCTGTTC 
2221 TGCAGGCAGC ACTTGAAGCT CCAAGGATCC TGAGCCTCAA CAACAACCCA TACTTCTCCG 
2281 GACCATACGG TAAGAGTTGC AGTCTTCGTA TATATATCTG TTGAGCTCGA GAATCTTCAC 
2341 AGGAAGCGGC CCATCAGACG GACTGTCATT TTACACTGAC TACTGCTGCT GCTCTTCGTC 
2401 CATCCATACA AGGGGAGGAC GTCGTGTTCG TCTGCAACGA CTGGGACACC GGCCCTCTCT 
2461 CGTGCTACCT CAAGAGCAAC TACCAGTCCC ACGGCATCTA CAGGGACGCA AAGGTTGCCT 
2521 TCTCTGAACT GAACAACGCC GTTTTCGTTC TCCATGCTCG TATATACCTC GTCTGGTAGT 
2581 GGTGGTGCTT CTCTGAGAAA CTAACTGAAA CTGACTGCAT GTCTGTCTGA CCATCTTCAC 
2641 GTACTACCAG ACCGCTTTCT GCATCCACAA CATCTCCTAC CAGGGCCGGT TCGCCTTCTC 
2701 CGACTACCCG GAGCTGAACC TCCCGGAGAG ATTCAAGTCG TCCTTCGATT TCATCGACGG 
2761 GTCTGTTTTC CTGCGTGCAT GTGAACATTC ATGAATGGTA ACCCACAACT GTTCGCGTCC 
2821 TGCTGGTTCA TTATCTGACC TGATTGCATT ATTGCAGCTA CGAGAAGCCC GTGGAAGGCC 
2881 GGAAGATCAA CTGGATGAAG GCCGGGATCC TCGAGGCCGA CAGGGTCCTC ACCGTCAGCC 
2941 CCTACTACGC CGAGGAGCTC ATCTCCGGCA TCGCCAGGGG CTGCGAGCTC GACAACATCA 
3001 TGCGCCTCAC CGGCATCACC GGCATCGTCA ACGGCATGGA CGTCAGCGAG TGGGACCCCA 
3061 GCAGGGACAA GTACATCGCC GTGAAGTACG ACGTGTCGAC GGTGAGCTGG CTAGCTCTGA 
3121 TTCTGCTGCC TGGTCCTCCT GCTCATCATG CTGGTTCGGT ACTGACGCGG CAAGTGTACG 
3181 TACGTGCGTG CGACGGTGGT GTCCGGTTCA GGCCGTGGAG GCCAAGGCGC TGAACAAGGA 
3241 GGCGCTGCAG GCGGAGGTCG GGCTCCCGGT GGACCGGAAC ATCCCGCTGG TGGCGTTCAT 
3301 CGGCAGGCTG GAAGAGCAGA AGGGCCCCGA CGTCATGGCG GCCGCCATCC CGCAGCTCAT 
3361 GGAGATGGTG GAGGACGTGC AGATCGTTCT GCTGGTACGT GTGCGCCGGC CGCCACCCGG
3421 CTACTACATG CGTGTATCGT TCGTTCTACT GGAACATGCG TGTGAGCAAC GCGATGGATA 
3481 ATGCTGCAGG GCACGGGCAA GAAGAAGTTC GAGCGCATGC TCATGAGCGC CGAGGAGAAG 
3541 TTCCCAGGCA AGGTGCGCGC CGTGGTCAAG TTCAACGCGG CGCTGGCGCA CCACATCATG 
3601 GCCGGCGCCG ACGTGCTCGC CGTCACCAGC CGCTTCGAGC CCTGCGGCCT CATCCAGCTG 
3661 CAGGGGATGC GATACGGAAC GGTACGAGAG AAAAAAAAAA TCCTGAATCC TGACGAGAGG 
3721 GACAGAGACA GATTATGAAT GCTTCATCGA TTTGAATTGA TTGATCGATG TCTCCCGCTG 
3781 CGACTCTTGC AGCCCTGCGC CTGCGCGTCC ACCGGTGGAC TCGTCGACAC CATCATCGAA 
3841 GGCAAGACCG GGTTCCACAT GGGCCGCCTC AGCGTCGACG TAAGCCTAGC TCTGCCATGT 
3901 TCTTTCTTCT TTCTTTCTGT ATGTATGTAT GAATCAGCAC CGCCGTTCTT GTTTCGTCGT 
3961 CGTCCTCTCT TCCCAGTGTA ACGTCGTGGA GCCGGCGGAC GTCAAGAAGG TGGCCACCAC 
4021 ATTGCAGCGC GCCATCAAGG TGGTCGGCAC GCCGGCGTAC GAGGAGATGG TGAGGAACTG 
4081 CATGATCCAG GATCTCTCCT GGAAGGTACG TACGCCCGCC CCGCCCCGCC CCGCCAGAGC 
4141 AGAGCGCCAA GATCGACCGA TCGACCGACC ACACGTACGC GCCTCGCTCC TGTCGCTGAC 
4201 CGTGGTTTAA TTTGCGAAAT GCGCAGGGCC CTGCCAAGAA CTGGGAGAAC GTGCTGCTCA 
4261 GCCTCGGGGT CGCCGGCGGC GAGCCAGGGG TCGAAGGCGA GGAGATCGCG CCGCTCGCCA 
4321 AGGAGAACGT GGCCGCGCCC TGAAGAGTTC GGCCTGCAGG GCCCCTGATC TCGCGCGTGG 
4381 TGCAAAGATG TTGGGACATC TTCTTATATA TGCTGTTTCG TTTATGTGAT ATGGACAAGT 
4441 ATGTGTAGCT GCTTGCTTGT GCTAGTGTAA TGTAGTGTAG TGGTGGCCAG TGGCACAACC 
4501 TAATAAGCGC ATGAACTAAT TGCTTGCGTG TGTAGTTAAG TACCGATCGG TAATTTTATA 
4561 TTGCGAGTAA ATAAATGGAC CTGTAGTGGT GGAGTAAATA ATCCCTGCTG TTCGGTGTTC 
4621 TTATCGCTCC TCGTATAGAT ATTATATAGA GTACATTTTT CTCTCTCTGA ATCCTACGTT 
4681 TGTGAAATTT CTATATCATT ACTGTAAAAT TTCTGCGTTC CAAAAGAGAC CATAGCCTAT
4741 CTTTGGCCCT GTTTGTTTCG GCTTCTGGCA GCTTCTGGCC ACCAAAAGCT GCTGCGGACT 
TABLE 1b

DNA Sequence and Deduced Amino Acid Sequence in waxy Gene in Rice
[SEQ ID NO:6 and SEQ ID NO:7]






LOCUS       OSWX      2542 bp    RNA    PLN
DEFINITION  O.sativa Waxy mRNA.
ACCESSION   X62134 S39554
KEYWORDS    glucosyltransferase; starch biosynthesis; waxy gene.
SOURCE      rice.
  ORGANISM  Oryza sativa
            Eukaryota; Plantae; Embryobionta; Magnoliophyta; Liliopsida;
            Commelinidae; Cyperales; Poaceae.
REFERENCE   1 (bases 1 to 2542)
  AUTHORS   Okagaki,R.J.
  TITLE     Direct Submission
  JOURNAL   Submitted (12-SEP-1991) to the EMBL/GenBank/DDBJ databases. R.J.
            Okagaki, University of Florida, Dep of Vegetable Crops, 1255
            Fifield Hall, 514 IFAS, Gainesville, Florida 32611-0514, USA
  STANDARD  full automatic
REFERENCE   2 (bases 1 to 2542)
  AUTHORS   Okagaki,R.J.
  TITLE     Nucleotide sequence of a long cDNA from the rice waxy gene
  JOURNAL   Plant Mol. Biol. 19, 513-516 (1992)
  STANDARD  full automatic
COMMENT     NCBI gi: 20402
FEATURES             Location/Qualifiers
     source          1..2542
                     /organism="Oryza sativa"
                     /dev_stage="immature seed"
                     /tissue_type="seed"
     CDS             453..2282
                     /gene="Wx"
                     /standard_name="Waxy gene"
                     /EC_number="2.4.1.21"
                     /note="NCBI gi: 20403"
                     /codon_start=1
                     /function="starch biosynthesis"
                     /product="starch (bacterial glycogen) synthase"
                     /translation="MSALTTSQLATSATGFGIADRSAPSSLLRHGFQGLKPRSPAGGD
                     ATSLSVTTSARATPKQQRSVQRGSRRFPSVVVYATGAGMNVVFVGAEMAPWSKTGGLG
                     DVLGGLPPAMAANGHRVMVISPRYDQYKDAWDTSVVAEIKVADRYERVRFFHCYKRGV
                     DRVFIDHPSFLEKVWGKTGEKIYGPDTGVDYKDNQMRFSLLCQAALEAPRILNLNNNP
                     YFKGTYGEDVVFVCNDWHTGPLASYLKNNYQPNGIYRNAKVAFCIHNISYQGRFAFED
                     YPELNLSERFRSSFDFIDGYDTPVEGRKINWMKAGILEADRVLTVSPYYAEELISGIA
                     RGCELDNIMRLTGITGIVNGMDVSEWDPSKDKYITAKYDATTAIEAKALNKEALQAEA
                     GLPVDRKIPLIAFIGRLEEQKGPDVMAAAIPELMQEDVQIVLLGTGKKKFEKLLKSME
                     EKYPGKVRAVVKFNAPLAHLIMAGADVLAVPSRFEPCGLIQLQGMRYGTPCACASTGG
                     LVDTVIEGKTGFHMGRLSVDCKVVEPSDVKKVAATLKRAIKVVGTPAYEEMVRNCMNQ
                     DLSWKGPAKNWENVLLGLGVAGSAPGIEGDEIAPLAKSNVAAP"
     3'UTR           2283..2535
     polyA_site      2535
BASE COUNT      610 A    665 C    693 G    574 T
ORIGIN

1 GAATTCAGTG TGAAGGAATA GATTCTCTTC AAAACAATTT AATCATTCAT CTGATCTGCT 
61 CAAAGCTCTG TGCATCTCCG GGTGCAACGG CCAGGATATT TATTGTGCAG TAAAAAAATG
121 TCATATCCCC TAGCCACCCA AGAAACTGCT CCTTAAGTCC TTATAAGCAC ATATGGCATT 
181 GTAATATATA TGTTTGAGTT TTAGCGACAA TTTTTTTAAA AACTTTTGGT CCTTTTTATG 
241 AACGXTTTAA GTTTCACTGT CTTTTTTTTT CGAATTTTAA ATGTAGCTTC AAATTCTAAT 
301 CCCCAATCCA AATTGTAATA AACTTCAATT CTCCTAATTA ACATCTTAAT TCATTTATTT 
361 GAAAACCAGT TCAAATTCTT TTTAGGCTCA CCAAACCTTA AACAATTCAA TTCAGTGCAG 
421 AGATCTTCCA CAGCAACAGC TAGACAACCA CCATGTCGGC TCTCACCACG TCCCAGCTCG 
481 CCACCTCGGC CACCGGCTTC GGCATCGCCG ACAGGTCGGC GCCGTCGTCG CTGCTCCGCC 
541 ACGGGTTCCA GGGCCTCAAG CCCCGCAGCC CCGCCGGCGG CGACGCGACG TCGCTCAGCG 
601 TGACGACCAG CGCGCGCGCG ACGCCCAAGC AGCAGCGGTC GGTGCAGCGT GGCAGCCGGA 
661 GGTTCCCCTC CGTCGTCGTG TACGCCACCG GCGCCGGCAT GAACGTCGTG TTCGTCGGCG 
721 CCGAGATGGC CCCCTGGAGC AAGACCGGCG GCCTCGGTGA CGTCCTCGGT GGCCTCCCCC 
781 CTGCCATGGC TGCGAATGGC CACAGGGTCA TGGTGATCTC TCCTCGGTAC GACCAGTACA 
841 AGGACGCTTG GGATACCAGC GTTGTGGCTG AGATCAAGGT TGCAGACAGG TACGAGAGGG
901 TGAGGTTTTT CCATTGCTAC AAGCGTGGAG TCGACCGTGT GTTCATCGAC CATCCGTCAT 
961 TCCTGGAGAA GGTTTGGGGA AAGACCGGTG AGAAGATCTA CGGACCTGAC ACTGGAGTTG 
1021 ATTACAAAGA CAACCAGATG CGTTTCAGCC TTCTTTGCCA GGCAGCACTC GAGGCTCCTA 
1081 GGATCCTAAA CCTCAACAAC AACCCATACT TCAAAGGAAC TTATGGTGAG GATGTTGTGT 
1141 TCGTCTGCAA CGACTGGCAC ACTGGCCCAC TGGCGAGCTA CCTGAAGAAC AACTACCAGC 
1201 CCAATGGCAT CTACAGGAAT GCAAAGGTTG CTTTCTGCAT CCACAACATC TCCTACCAGG 
1261 GCCGTTTCGC TTTCGAGGAT TACCCTGAGC TGAACCTCTC CGAGAGGTTC AGGTCATCCT 
1321 TCGATTTCAT CGACGGGTAT GACACGCCGG TGGAGGGCAG GAAGATCAAC TGGATGAAGG 
1381 CCGGAATCCT GGAAGCCGAC AGGGTGCTCA CCGTGAGCCC GTACTACGCC GAGGAGCTCA 
1441 TCTCCGGCAT CGCCAGGGGA TGCGAGCTCG ACAACATCAT GCGGCTCACC GGCATCACCG 
1501 GCATCGTCAA CGGCATGGAC GTCAGCGAGT GGGATCCTAG CAAGGACAAG TACATCACCG 
1561 CCAAGTACGA CGCAACCACG GCAATCGAGG CGAAGGCGCT GAACAAGGAG GCGTTGCAGG 
1621 CGGAGGCGGG TCTTCCGGTC GACAGGAAAA TCCCACTGAT CGCGTTCATC GGCAGGCTGG 
1681 AGGAACAGAA GGGCCCTGAC GTCATGGCCG CCGCCATCCC GGAGCTCATG CAGGAGGACG 
1741 TCCAGATCGT TCTTCTGGGT ACTGGAAAGA AGAAGTTCGA GAAGCTGCTC AAGAGCATGG 
1801 AGGAGAAGTA TCCGGGCAAG GTGAGGGCGG TGGTGAAGTT CAACGCGCCG CTTGCTCATC 
1861 TCATCATGGC CGGAGCCGAC GTGCTCGCCG TCCCCAGCCG CTTCGAGCCC TGTGGACTCA 
1921 TCCAGCTGCA GGGGATGAGA TACGGAACGC CCTGTGCTTG CGCGTCCACC GGTGGGCTCG 
1981 TGGACACGGT CATCGAAGGC AAGACTGGTT TCCACATGGG CCGTCTCAGC GTCGACTGCA 
2041 AGGTGGTGGA GCCAAGCGAC GTGAAGAAGG TGGCGGCCAC CCTGAAGCGC GCCATCAAGG 
2101 TPGTCGGCAC GCCGGCGTAC GAGGAGATGG TCAGGAACTG CATGAACCAG [illegible]
2161 [illegible] TGCGAAGAAC TGGGAGAATG [illegible]
2221 CGCCGGGGAT CGAAGGCGAC GAGATCGCGC CGCTCGCCAA CGAGAACGTG [illegible]
2281 GAAGAGCCTG AGATCTACAT ATGGAGTGAT TAATTAATAT AGCAGTATAT GGATGAGACA
2341 CGAATGAACC AGTGGTTTGT TTGTTGTAGT GAATTTGTAG CTATAGCCAA TTATATAGGC
2401 TAATAAGTTT GATGTTGTAC TCTTCTGGGT GTGCTTAAGT ATCTTATCGG ACCCTGAATT
2461 TATGTGTGTG GCTTATTGCC AATAATATTA AGTAATAAAG GGTTTATTAT ATTATTATAT
2521 ATGTTATATT ATACTAAAAA AA
//

TABLE 2 

DNA Sequence and Deduced Amino Acid Sequence of 
the Soluble Starch Synthase IIa Gene in Maize
[SEQ ID NO:8 and SEQ ID NO:9]

FILE NAME : MSS2C.SEQ SEQUENCE : NORMAL 2007 BP 

CODON TABLE : UNIV.TCN

SEQUENCE REGION : 1 - 2007 

TRANSLATION REGION : 1 - 2007 

*** DNA TRANSLATION *** 

1 GCT GAG GCT GAG GCC GGG GGC AAG GAC GCG CCG CCG GAG AGG AGC GGC 48
1 AEAEAGGKDAPPERSG 16

49 GAC GCC GCC AGG TTG CCC CGC GCT CGG CGC AAT GCG GTC TCC AAA CGG 96
17 DAARLPRARRNAVSKR 32

97 AGG GAT CCT CTT CAG CCG GTC GGC CGG TAC GGC TCC GCG ACG GGA AAC 144
33 RDPLQPVGRYGSATGN 48

145 ACG GCC AGG ACC GGC GCC GCG TCC TGC CAG AAC GCC GCA TTG GCG GAC 192
49 TARTGAASCQNAALAD 64

193 GTT GAG ATC GTT GAG ATC AAG TCC ATC GTC GCC GCG CCG CCG ACG AGC 240
65 VEIVEIKSIVAAPPTS 80

241 ATA GTG AAG TTC CCA GGG CGC GGG CTA CAG GAT GAT CCT TCC CTC TGG 288
81 IVKFPGRGLQDDPSLW 96

289 GAC ATA GCA CCG GAG ACT GTC CTC CCA GCC CCG AAG CCA CTG CAT GAA 336
97 DIAPETVLPAPKPLHE 112

337 TCG CCT GCG GTT GAC GGA GAT TCA AAT GGA ATT GCA CCT CCT ACA GTT 384
113 SPAVDGDSNGIAPPTV 128

385 GAG CCA TTA GTA CAG GAG GCC ACT TGG GAT TTC AAG AAA TAC ATC GGT 432
129 EPLVQEATWDFKKYIG 144

433 TTT GAC GAG CCT GAC GAA GCG AAG GAT GAT TCC AGG GTT GGT GCA GAT 480 
145 FDEPDEAKDDSRVGAD 160

481 [illegible] TGT 528
161 [illegible] C 176

529 [illegible] CCA 576
177 [illegible] P 192

577 [illegible] AAG 624
193 [illegible] K 208

625 [illegible] TAT 672
209 [illegible] Y 224

673 [illegible] AAA 720
225 [illegible] K 240

721 [illegible] GAT 768
241 [illegible] D 256

769 [illegible] GAT 816
257 [illegible] D 272

817 [illegible] TTG 864
273 [illegible] L 288

865 [illegible] GGT 912
289 [illegible] G 304

913 [illegible] CAC 960
305 [illegible] H 320

961 [illegible] GGG 1008
321 [illegible] G 336

1009 [illegible] CAC 1056
337 [illegible] H 352

1057 [illegible] AAC 1104
353 [illegible] N 368

1105 [illegible] CAC 1152
369 [illegible] H 384

1153 [illegible] GTG 1200
385 [illegible] V 400

1201 [illegible] GGC 1248
401 [illegible] G 416

1249 [illegible] GGC 1296
417 [illegible] G 432

1297 [illegible] GTG 1344
433 [illegible] V 448

1345 [illegible] GAC 1392
449 [illegible] D 464

1393 [illegible] CTG 1440
465 [illegible] L 480

1441 [illegible] GAT 1488
481 [illegible] D 496

1489 [illegible] GCG 1536
497 [illegible] A 512

1537 GGG CAG GAC GTG CAG CTG GTG ATG CTG GGC ACC GGC CCA CCT GAC CTG 1584
513 GQDVQLVMLGTGPPDL 528

1585 GAA CGA ATG CTG CAG CAC TTG GAG CGG GAG CAT CCC AAC AAG GTG CGC 1632
529 ERMLQHLEREHPNKVR 544

1633 GGG TGG GTC GGG TTC TCG GTC CTA ATG GTG CAT CGC ATC ACG CCG GGC 1680
545 GWVGFSVLMVHRITPG 560

1681 GCC AGC GTG CTG GTG ATG CCC TCC CGC TTC GCC GGC GGG CTG AAC CAG 1728
561 ASVLVMPSRFAGGLNQ 576

1729 CTC TAC GCG ATG GCA TAC GGC ACC GTC CCT GTG GTG CAC GCC GTG GGC 1776
577 LYAMAYGTVPVVHAVG 592

1777 GGG CTC AGG GAC ACC GTG GCG CCG TTC GAC CCG TTC GGC GAC GCC GGG 1824
593 GLRDTVAPFDPFGDAG 608

1825 CTC GGG TGG ACT TTT GAC CGC GCC GAG GCC AAC AAG CTG ATC GAG GTG 1872
609 LGWTFDRAEANKLIEV 624

1873 CTC AGC CAC TGC CTC GAC ACG TAC CGA AAC TAC GAG GAG AGC TGG AAG 1920
625 LSHCLDTYRNYEESWK 640

1921 AGT CTC CAG GCG CGC GGC ATG TCG CAG AAC CTC AGC TGG GAC CAC GCG 1968
641 SLQARGMSQNLSWDHA 656

1969 GCT GAG CTC TAC GAG GAC GTC CTT GTC AAG TAC CAG TGG 2007
657 AELYEDVLVKYQW 669
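
The codon-by-codon layout of Table 2 can be checked mechanically. The following is a minimal sketch, assuming that the codon table named in the listing header corresponds to the standard (universal) genetic code; the function name `translate` and the way the codon table is built are illustrative and are not part of the original listing. It translates the final row of Table 2 (nucleotides 1969 to 2007) and should reproduce the residue line printed beneath that row.

```python
from itertools import product

# Standard genetic code with bases ordered T, C, A, G (assumption: this is
# the "universal" codon table referred to in the listing header).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame from its first base ('*' marks a stop)."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Final row of Table 2 (nucleotides 1969 to 2007).
last_row = "GCT GAG CTC TAC GAG GAC GTC CTT GTC AAG TAC CAG TGG"
print(translate(last_row))  # AELYEDVLVKYQW
```

The same function can be applied to any other row of Tables 2 to 4 to compare a codon line against its residue line.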



TABLE 3 

DNA Sequence and Deduced Amino Acid Sequence of 
The Soluble Starch Synthase IIb Gene in Maize
[SEQ ID NO:10 and SEQ ID NO:11]



FILE NAME : MSS3FULL.DNA SEQUENCE : NORMAL 2097 BP 

CODON TABLE : UNIV.TCN

SEQUENCE REGION : 1 - 2097 

TRANSLATION REGION : 1 - 2097 



*** DNA TRANSLATION *** 



1 ATG CCG GGG GCA ATC TCT TCC TCG TCG TCG GCT TTT CTC CTC CCC GTC 48
1 MPGAISSSSSAFLLPV 16

49 GCG TCC TCC TCG CCG CGG CGC AGG CGG GGC AGT GTG GGT GCT GCT CTG 96
17 ASSSPRRRRGSVGAAL 32

97 CGC TCG TAC GGC TAC AGC GGC GCG GAG CTG CGG TTG CAT TGG GCG CGG 144
33 RSYGYSGAELRLHWAR 48

145 CGG GGC CCG CCT CAG GAT GGA GCG GCG TCG GTA CGC GCC GCA GCG GCA 192
49 RGPPQDGAASVRAAAA 64

193 CCG GCC GGG GGC GAA AGC GAG GAG GCA GCG AAG AGC TCC TCC TCG TCC 240
65 PAGGESEEAAKSSSSS 80

241 CAG GCG GGC GCT GTT CAG GGC AGC ACG GCC AAG GCT GTG GAT TCT GCT 288
81 QAGAVQGSTAKAVDSA 96

289 TCA CCT CCC AAT CCT TTG ACA TCT GCT CCG AAG CAA AGT CAG AGC GCT 336
97 SPPNPLTSAPKQSQSA 112

337 GCA ATG CAA AAC GGA ACG AGT GGG GGC AGC AGC GCG AGC ACC GCC GCG 384
113 AMQNGTSGGSSASTAA 128

385 CCG GTG TCC GGA CCC AAA GCT GAT CAT CCA TCA GCT CCT GTC ACC AAG 432
129 PVSGPKADHPSAPVTK 144

433 AGA GAA ATC GAT GCC AGT GCG GTG AAG CCA GAG CCC GCA GGT GAT GAT 480
145 REIDASAVKPEPAGDD 160

481 GCT AGA CCG GTG GAA AGC ATA GGC ATC GCT GAA CCG GTG GAT GCT AAG 528
161 ARPVESIGIAEPVDAK 176

529 GCT GAT GCA GCT CCG GCT ACA GAT GCG GCG GCG AGT GCT CCT TAT GAC 576
177 ADAAPATDAAASAPYD 192

577 AGG GAG GAT AAT GAA CCT GGC CCT TTG GCT GGG CCT AAT GTG ATG AAC 624
193 REDNEPGPLAGPNVMN 208

625 GTC GTC GTG GTG GCT TCT GAA TGT GCT CCT TTC TGC AAG ACA GGT GGC 672
209 VVVVASECAPFCKTGG 224

673 CTT GGA GAT GTC GTG GGT GCT TTG CCT AAG GCT CTG GCG AGG AGA GGA 720
225 LGDVVGALPKALARRG 240

721 CAC CGT GTT ATG GTC GTG ATA CCA AGA TAT GGA GAG TAT GCC GAA GCC 768
241 HRVMVVIPRYGEYAEA 256

769 CGG GAT TTA GGT GTA AGG AGA CGT TAC AAG GTA GCT GGA CAG GAT TCA 816
257 RDLGVRRRYKVAGQDS 272

817 GAA GTT ACT TAT TTT CAC TCT TAC ATT GAT GGA GTT GAT TTT GTA TTC 864
273 EVTYFHSYIDGVDFVF 288

865 GTA GAA GCC CCT CCC TTC CGG CAC CGG CAC AAT AAT ATT TAT GGG GGA 912
289 VEAPPFRHRHNNIYGG 304

913 GAA AGA TTG GAT ATT TTG AAG CGC ATG ATT TTG TTC TGC AAG GCC GCT 960
305 ERLDILKRMILFCKAA 320

961 GTT GAG GTT CCA TGG TAT GCT CCA TGT GGC GGT ACT GTC TAT GGT GAT 1008
321 VEVPWYAPCGGTVYGD 336

1009 GGC AAC TTA GTT TTC ATT GCT AAT GAT TGG CAT ACC GCA CTT CTG CCT 1056
337 GNLVFIANDWHTALLP 352

1057 GTC TAT CTA AAG GCC TAT TAC CGG GAC AAT GGT TTG ATG CAG TAT GCT 1104
353 VYLKAYYRDNGLMQYA 368

1105 CGG TCT GTG CTT GTG ATA CAC AAC ATT GCT CAT CAG GGT CGT GGC CCT 1152
369 RSVLVIHNIAHQGRGP 384

1153 GTA GAC GAC TTC GTC AAT TTT GAC TTG CCT GAA CAC TAC ATC GAC CAC 1200
385 VDDFVNFDLPEHYIDH 400

1201 TTC AAA CTG TAT GAC AAC ATT GGT GGG GAT CAC AGC AAC GTT TTT GCT 1248
401 FKLYDNIGGDHSNVFA 416

1249 GCG GGG CTG AAG ACG GCA GAC CGG GTG GTG ACC GTT AGC AAT GGC TAC 1296
417 AGLKTADRVVTVSNGY 432

1297 ATG TGG GAG CTG AAG ACT TCG GAA GGC GGG TGG GGC CTC CAC GAC ATC 1344
433 MWELKTSEGGWGLHDI 448

1345 ATA AAC CAG AAC GAC TGG AAG CTG CAG GGC ATC GTG AAC GGC ATC GAC 1392
449 INQNDWKLQGIVNGID 464

1393 ATG AGC GAG TGG AAC CCC GCT GTG GAC GTG CAC CTC CAC TCC GAC GAC 1440
465 MSEWNPAVDVHLHSDD 480

1441 TAC ACC AAC TAC ACG TTC GAG ACG CTG GAC ACC GGC AAG CGG CAG TGC 1488
481 YTNYTFETLDTGKRQC 496

1489 AAG GCC GCC CTG CAG CGG CAG CTG GGC CTG CAG GTC CGC GAC GAC GTG 1536
497 KAALQRQLGLQVRDDV 512

1537 CCA CTG ATC GGG TTC ATC GGG CGG CTG GAC CAC CAG AAG GGC GTG GAC 1584
513 PLIGFIGRLDHQKGVD 528

1585 ATC ATC GCC GAC GCG ATC CAC TGG ATC GCG GGG CAG GAC GTG CAG CTC 1632
529 IIADAIHWIAGQDVQL 544

1633 GTG ATG CTG GGC ACC GGG CGG GCC GAC CTG GAG GAC ATG CTG CGG CGG 1680
545 VMLGTGRADLEDMLRR 560

1681 TTC GAG TCG GAG CAC AGC GAC AAG GTG CGC GCG TGG GTG GGG TTC TCG 1728
561 FESEHSDKVRAWVGFS 576

1729 GTG CCC CTG GCG CAC CGC ATC ACG GCG GGC GCG GAC ATC CTG CTG ATG 1776
577 VPLAHRITAGADILLM 592

1777 CCG TCG CGG TTC GAG CCG TGC GGG CTG AAC CAG CTC TAC GCC ATG GCG 1824
593 PSRFEPCGLNQLYAMA 608

1825 TAC GGG ACC GTG CCC GTG GTG CAC GCC GTG GGG GGG CTC CGG GAC ACG 1872
609 YGTVPVVHAVGGLRDT 624

1873 GTG GCG CCG TTC GAC CCG TTC AAC GAC ACC GGG CTC GGG TGG ACG TTC 1920
625 VAPFDPFNDTGLGWTF 640

1921 GAC CGC GCG GAG GCG AAC CGG ATG ATC GAC GCG CTC TCG CAC TGC CTC 1968
641 DRAEANRMIDALSHCL 656

1969 ACC ACG TAC CGG AAC TAC AAG GAG AGC TGG CGC GCC TGC AGG GCG CGC 2016
657 TTYRNYKESWRACRAR 672

2017 GGC ATG GCC GAG GAC CTC AGC TGG GAC CAC GCC GCC GTG CTG TAT GAG 2064
673 GMAEDLSWDHAAVLYE 688

2065 GAC GTG CTC GTC AAG GCG AAG TAC CAG TGG TGA 2097
689 DVLVKAKYQW* 699



TABLE 4 

DNA and Deduced Amino Acid Sequence of 
The Soluble Starch Synthase I Gene in Maize 
[SEQ ID NO:12; SEQ ID NO:13]



FILE NAME : MSSIFULL.DNA SEQUENCE : NORMAL 1752 BP

CODON TABLE : UNIV.TCN

SEQUENCE REGION : 1 - 1752

TRANSLATION REGION : 1 - 1752

TGC GTC GCG GAG CTG AGC AGG GAG GGG CCC GCG CCG CGC CCG CTG CCA 48
Cys Val Ala Glu Leu Ser Arg Glu Gly Pro Ala Pro Arg Pro Leu Pro
700 705 710 715

CCC GCG CTG CTG GCG CCC CCG CTC GTG CCC GGC TTC CTC GCG CCG CCG 96
Pro Ala Leu Leu Ala Pro Pro Leu Val Pro Gly Phe Leu Ala Pro Pro
720 725 730

GCC GAG CCC ACG GGT GAG CCG GCA TCG ACG CCG CCG CCC GTG CCC GAC 144
Ala Glu Pro Thr Gly Glu Pro Ala Ser Thr Pro Pro Pro Val Pro Asp
735 740 745

GCC GGC CTG GGG GAC CTC GGT CTC GAA CCT GAA GGG ATT GCT GAA GGT 192
Ala Gly Leu Gly Asp Leu Gly Leu Glu Pro Glu Gly Ile Ala Glu Gly
750 755 760

TCC ATC GAT AAC ACA GTA GTT GTG GCA AGT GAG CAA GAT TCT GAG ATT 240
Ser Ile Asp Asn Thr Val Val Val Ala Ser Glu Gln Asp Ser Glu Ile
765 770 775

GTG GTT GGA AAG GAG CAA GCT CGA GCT AAA GTA ACA CAA AGC ATT GTC 288
Val Val Gly Lys Glu Gln Ala Arg Ala Lys Val Thr Gln Ser Ile Val
780 785 790 795

[illegible] TCT GGG GGT CTA GGA 336
Phe Val Thr Gly Glu Ala Ser Pro Tyr Ala Lys Ser Gly Gly Leu Gly
800 805 810

GAT GTT TGT GGT TCA TTG CCA GTT GCT CTT GCT GCT CGT GGT CAC CGT 384
Asp Val Cys Gly Ser Leu Pro Val Ala Leu Ala Ala Arg Gly His Arg
815 820 825

[illegible] GGT ACC TCC GAT AAG AAT 432
Val Met Val Val Met Pro Arg Tyr Leu Asn Gly Thr Ser Asp Lys Asn
830 835 840

TAT GCA AAT GCA TTT TAC ACA GAA AAA CAC ATT CGG ATT CCA TGC TTT 480
Tyr Ala Asn Ala Phe Tyr Thr Glu Lys His Ile Arg Ile Pro Cys Phe
845 850 855

GGC GGT GAA CAT GAA GTT ACC TTC TTC CAT GAG TAT AGA GAT TCA GTT 528
Gly Gly Glu His Glu Val Thr Phe Phe His Glu Tyr Arg Asp Ser Val
860 865 870 875

[illegible] TCA TAT CAC AGA CCT GGA AAT TTA 576
Asp Trp Val Phe Val Asp His Pro Ser Tyr His Arg Pro Gly Asn Leu
880 885 890

TAT GGA GAT AAG TTT GGT GCT TTT GGT GAT AAT CAG TTC AGA TAC ACA 624
Tyr Gly Asp Lys Phe Gly Ala Phe Gly Asp Asn Gln Phe Arg Tyr Thr
895 900 905

CTC CTT TGC TAT GCT GCA TGT GAG GCT CCT TTG ATC CTT GAA TTG GGA 672
Leu Leu Cys Tyr Ala Ala Cys Glu Ala Pro Leu Ile Leu Glu Leu Gly
910 915 920

GGA TAT ATT TAT GGA CAG AAT TGC ATG TTT GTT GTC AAT GAT TGG CAT 720
Gly Tyr Ile Tyr Gly Gln Asn Cys Met Phe Val Val Asn Asp Trp His
925 930 935

GCC AGT CTA GTG CCA GTC CTT CTT GCT GCA AAA TAT AGA CCA TAT GGT 768
Ala Ser Leu Val Pro Val Leu Leu Ala Ala Lys Tyr Arg Pro Tyr Gly
940 945 950 955

[illegible] TCC CGC AGC ATT CTT GTA ATA CAT AAT TTA GCA CAT 816
Val Tyr Lys Asp Ser Arg Ser Ile Leu Val Ile His Asn Leu Ala His
960 965 970

CAG GGT GTA GAG CCT GCA AGC ACA TAT CCT GAC CTT GGG TTG CCA CCT 864
Gln Gly Val Glu Pro Ala Ser Thr Tyr Pro Asp Leu Gly Leu Pro Pro
975 980 985

GAA TGG TAT GGA GCT CTG GAG TGG GTA TTC CCT GAA TGG GCG AGG AGG 912
Glu Trp Tyr Gly Ala Leu Glu Trp Val Phe Pro Glu Trp Ala Arg Arg
990 995 1000

CAT GCC CTT GAC AAG GGT GAG GCA GTT AAT TTT TTG AAA GGT GCA GTT 960
His Ala Leu Asp Lys Gly Glu Ala Val Asn Phe Leu Lys Gly Ala Val
1005 1010 1015

GTG ACA GCA GAT CGA ATC GTG ACT GTC AGT AAG GGT TAT TCG TGG GAG 1008
Val Thr Ala Asp Arg Ile Val Thr Val Ser Lys Gly Tyr Ser Trp Glu
1020 1025 1030 1035

GTC ACA ACT GCT GAA GGT GGA CAG GGC CTC AAT GAG CTC TTA AGC TCC 1056
Val Thr Thr Ala Glu Gly Gly Gln Gly Leu Asn Glu Leu Leu Ser Ser
1040 1045 1050

AGA AAG AGT GTA TTA AAC GGA ATT GTA AAT GGA ATT GAC ATT AAT GAT 1104
Arg Lys Ser Val Leu Asn Gly Ile Val Asn Gly Ile Asp Ile Asn Asp
1055 1060 1065

TGG AAC CCT GCC ACA GAC AAA TGT ATC CCC TGT CAT TAT TCT GTT GAT 1152
Trp Asn Pro Ala Thr Asp Lys Cys Ile Pro Cys His Tyr Ser Val Asp
1070 1075 1080

GAC CTC TCT GGA AAG GCC AAA TGT AAA GGT GCA TTG CAG AAG GAG CTG 1200
Asp Leu Ser Gly Lys Ala Lys Cys Lys Gly Ala Leu Gln Lys Glu Leu
1085 1090 1095

GGT TTA CCT ATA AGG CCT GAT GTT CCT CTG ATT GGC TTT ATT GGA AGG 1248
Gly Leu Pro Ile Arg Pro Asp Val Pro Leu Ile Gly Phe Ile Gly Arg
1100 1105 1110 1115

TTG GAT TAT CAG AAA GGC ATT GAT CTC ATT CAA CTT ATC ATA CCA GAT 1296
Leu Asp Tyr Gln Lys Gly Ile Asp Leu Ile Gln Leu Ile Ile Pro Asp
1120 1125 1130

CTC ATG CGG GAA GAT GTT CAA TTT GTC ATG CTT GGA TCT GGT GAC CCA 1344
Leu Met Arg Glu Asp Val Gln Phe Val Met Leu Gly Ser Gly Asp Pro
1135 1140 1145

GAG CTT GAA GAT TGG ATG AGA TCT ACA GAG TCG ATC TTC AAG GAT AAA 1392
Glu Leu Glu Asp Trp Met Arg Ser Thr Glu Ser Ile Phe Lys Asp Lys
1150 1155 1160

TTT CGT GGA TGG GTT GGA TTT AGT GTT CCA GTT TCC CAC CGA ATA ACT 1440
Phe Arg Gly Trp Val Gly Phe Ser Val Pro Val Ser His Arg Ile Thr
1165 1170 1175

GCC GGC TGC GAT ATA TTG TTA ATG CCA TCC AGA TTC GAA CCT TGT GGT 1488
Ala Gly Cys Asp Ile Leu Leu Met Pro Ser Arg Phe Glu Pro Cys Gly
1180 1185 1190 1195

CTC AAT CAG CTA TAT GCT ATG CAG TAT GGC ACA GTT CCT GTT GTC CAT 1536
Leu Asn Gln Leu Tyr Ala Met Gln Tyr Gly Thr Val Pro Val Val His
1200 1205 1210

GCA ACT GGG GGC CTT AGA GAT ACC GTG GAG AAC TTC AAC CCT TTC GGT 1584
Ala Thr Gly Gly Leu Arg Asp Thr Val Glu Asn Phe Asn Pro Phe Gly
1215 1220 1225

GAG AAT GGA GAG CAG GGT ACA GGG TGG GCA TTC GCA CCC CTA ACC ACA 1632
Glu Asn Gly Glu Gln Gly Thr Gly Trp Ala Phe Ala Pro Leu Thr Thr
1230 1235 1240

GAA AAC ATG TTT GTG GAC ATT GCG AAC TGC AAT ATC TAC ATA CAG GGA 1680
Glu Asn Met Phe Val Asp Ile Ala Asn Cys Asn Ile Tyr Ile Gln Gly
1245 1250 1255

ACA CAA GTC CTC CTG GGA AGG GCT AAT GAA GCG AGG CAT GTC AAA AGA 1728
Thr Gln Val Leu Leu Gly Arg Ala Asn Glu Ala Arg His Val Lys Arg
1260 1265 1270 1275

CTT CAC GTG GGA CCA TGC CGC TGA 1752
Leu His Val Gly Pro Cys Arg *
1280

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 584 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Cys Val Ala Glu Leu Ser Arg Glu Gly Pro Ala Pro Arg Pro Leu Pro
1 5 10 15

Pro Ala Leu Leu Ala Pro Pro Leu Val Pro Gly Phe Leu Ala Pro Pro
20 25 30

Ala Glu Pro Thr Gly Glu Pro Ala Ser Thr Pro Pro Pro Val Pro Asp
35 40 45

Ala Gly Leu Gly Asp Leu Gly Leu Glu Pro Glu Gly Ile Ala Glu Gly
50 55 60

Ser Ile Asp Asn Thr Val Val Val Ala Ser Glu Gln Asp Ser Glu Ile
65 70 75 80

Val Val Gly Lys Glu Gln Ala Arg Ala Lys Val Thr Gln Ser Ile Val
85 90 95

Phe Val Thr Gly Glu Ala Ser Pro Tyr Ala Lys Ser Gly Gly Leu Gly
100 105 110

Asp Val Cys Gly Ser Leu Pro Val Ala Leu Ala Ala Arg Gly His Arg
115 120 125

Val Met Val Val Met Pro Arg Tyr Leu Asn Gly Thr Ser Asp Lys Asn
130 135 140

Tyr Ala Asn Ala Phe Tyr Thr Glu Lys His Ile Arg Ile Pro Cys Phe
145 150 155 160

Gly Gly Glu His Glu Val Thr Phe Phe His Glu Tyr Arg Asp Ser Val
165 170 175

Asp Trp Val Phe Val Asp His Pro Ser Tyr His Arg Pro Gly Asn Leu
180 185 190

Tyr Gly Asp Lys Phe Gly Ala Phe Gly Asp Asn Gln Phe Arg Tyr Thr
195 200 205

Leu Leu Cys Tyr Ala Ala Cys Glu Ala Pro Leu Ile Leu Glu Leu Gly
210 215 220

Gly Tyr Ile Tyr Gly Gln Asn Cys Met Phe Val Val Asn Asp Trp His
225 230 235 240

Ala Ser Leu Val Pro Val Leu Leu Ala Ala Lys Tyr Arg Pro Tyr Gly
245 250 255

Val Tyr Lys Asp Ser Arg Ser Ile Leu Val Ile His Asn Leu Ala His
260 265 270

Gln Gly Val Glu Pro Ala Ser Thr Tyr Pro Asp Leu Gly Leu Pro Pro
275 280 285

Glu Trp Tyr Gly Ala Leu Glu Trp Val Phe Pro Glu Trp Ala Arg Arg
290 295 300

His Ala Leu Asp Lys Gly Glu Ala Val Asn Phe Leu Lys Gly Ala Val
305 310 315 320

Val Thr Ala Asp Arg Ile Val Thr Val Ser Lys Gly Tyr Ser Trp Glu
325 330 335

Val Thr Thr Ala Glu Gly Gly Gln Gly Leu Asn Glu Leu Leu Ser Ser
340 345 350

Arg Lys Ser Val Leu Asn Gly Ile Val Asn Gly Ile Asp Ile Asn Asp
355 360 365

Trp Asn Pro Ala Thr Asp Lys Cys Ile Pro Cys His Tyr Ser Val Asp
370 375 380

Asp Leu Ser Gly Lys Ala Lys Cys Lys Gly Ala Leu Gln Lys Glu Leu
385 390 395 400

Gly Leu Pro Ile Arg Pro Asp Val Pro Leu Ile Gly Phe Ile Gly Arg
405 410 415

Leu Asp Tyr Gln Lys Gly Ile Asp Leu Ile Gln Leu Ile Ile Pro Asp
420 425 430

Leu Met Arg Glu Asp Val Gln Phe Val Met Leu Gly Ser Gly Asp Pro
435 440 445

Glu Leu Glu Asp Trp Met Arg Ser Thr Glu Ser Ile Phe Lys Asp Lys
450 455 460

Phe Arg Gly Trp Val Gly Phe Ser Val Pro Val Ser His Arg Ile Thr
465 470 475 480

Ala Gly Cys Asp Ile Leu Leu Met Pro Ser Arg Phe Glu Pro Cys Gly
485 490 495

Leu Asn Gln Leu Tyr Ala Met Gln Tyr Gly Thr Val Pro Val Val His
500 505 510

Ala Thr Gly Gly Leu Arg Asp Thr Val Glu Asn Phe Asn Pro Phe Gly
515 520 525

Glu Asn Gly Glu Gln Gly Thr Gly Trp Ala Phe Ala Pro Leu Thr Thr
530 535 540

Glu Asn Met Phe Val Asp Ile Ala Asn Cys Asn Ile Tyr Ile Gln Gly
545 550 555 560

Thr Gln Val Leu Leu Gly Arg Ala Asn Glu Ala Arg His Val Lys Arg
565 570 575

Leu His Val Gly Pro Cys Arg *
580
TABLE 5

mRNA Sequence and Deduced Amino Acid Sequence of 
The Maize Branching Enzyme II Gene and the Transit Peptide
[SEQ ID NO:14 and SEQ ID NO:15]






LOCUS       MZEGLUCTRN   2725 bp    ss-mRNA    PLN
DEFINITION  Corn starch branching enzyme II mRNA, complete cds.
ACCESSION   L08065
KEYWORDS    1,4-alpha-glucan branching enzyme; amylo-transglycosylase;
            glucanotransferase; starch branching enzyme II.
SOURCE      Zea mays cDNA to mRNA.
  ORGANISM  Zea mays
            Eukaryota; Plantae; Embryobionta; Magnoliophyta; Liliopsida;
            Commelinidae; Cyperales; Poaceae.
REFERENCE   1 (bases 1 to 2725)
  AUTHORS   Fisher, D.K., Boyer, C.D. and Hannah, L.C.
  TITLE     Starch branching enzyme II from maize endosperm
  JOURNAL   Plant Physiol. 102, 1045-1046 (1993)
  STANDARD  full automatic
COMMENT     NCBI gi: 168482
FEATURES             Location/Qualifiers
     source          1..2725
                     /cultivar="W64Axl82S"
                     /dev_stage="29 days post pollenation"
                     /tissue_type="endosperm"
                     /organism="Zea mays"
     sig_peptide     91..264
                     /codon_start=1
     CDS             91..2490
                     /EC_number="2.4.1.18"
                     /note="NCBI gi: 168483"
                     /codon_start=1
                     /product="starch branching enzyme II"
                     /translation="MAFRVSGAVLGGAVRAPRLTGGGEGSLVFRHTGLFLTRGARVGC
                     SGTHGAMRAAAAARKAVMVPEGENDGLASRADSAQFQSDELEVPDISEETTCGAGVAD
                     AQALNRVRVVPPPSDGQKIFQIDPMLQGYKYHLEYRYSLYRRIRSDIDSHEGGLEAFS
                     RSYEKFGFNASAEGITYREWAPGAFSAALVGDVNNWDPNADRMSKNEFGVWEIFLPNN
                     ADGTSPIPHGSRVKVRMDTPSGIKDSIPAWIKYSVQAPGEIPYDGIYYDPPEEVKYVF
                     RHAQPKRPKSLRIYETHVGMSSPEPKINTYVNFRDEVLPRIKKLGYNAVQIMAIQEHS
                     YYGSFGYHVTNFFAPSSRFGTPEDLKSLIDRAHELGLLVLMDVVHSHASSNTLDGLNG
                     FDGTDTHYFHSGPRGHHWMWDSRLFNYGNWEVLRFLLSNARWWLEEYKFDGFRFDGVT
                     SMMYTHHGLQVTFTGNFNEYFGFATDVDAVVYLMLVNDLIHGLYPEAVTIGEDVSGMP
                     TFALPVHDGGVGFDYRMHMAVADKWIDLLKQSDETWKMGDIVHTLTNRRWLEKCVTYA
                     ESHDQALVGDKTIAFWLMDKDMYDFMALDRPSTPTIDRGIALHKMIRLITMGLGGEGY
                     LNFMGNEFGHPEWIDFPRGPQRLPSGKFIPGNNNSYDKCRRRFDLGDADYLRYHGMQE
                     FDQAMQHLEQKYEFMTSDHQYISRKHEEDKVIVFEKGDLVFVFNFHCNNSYFDYRIGC
                     RKPGVYKVVLDSDAGLFGGFSRIHHAAEHFTADCSHDNRPYSFSVYTPSRTCVVYAPV
                     E"
     mat_peptide     265..2487
                     /codon_start=1
                     /product="starch branching enzyme II"
BASE COUNT      727 A    534 C    715 G    749 T
ORIGIN

1 GGCCCAGAGC AGACCCGGAT TTCGCTCTTG CGGTCGCTGG GGTTTTAGCA TTGGCTGATC
61 AGTTCGATCC GATCCGGCTG CGAAGGCGAG ATGGCGTTCC GGGTTTCTGG GGCGGTGCTC
121 GGTGGGGCCG TAAGGGCTCC CCGACTCACC GGCGGCGGGG AGGGTAGTCT AGTCTTCCGG
181 CACACCGGCC TCTTCTTAAC TCGGGGTGCT CGAGTTGGAT GTTCGGGGAC GCACGGGGCC
241 ATGCGCGCGG CGGCCGCGGC CAGGAAGGCG GTCATGGTTC CTGAGGGCGA GAATGATGGC
301 CTCGCATCAA GGGCTGACTC GGCTCAATTC CAGTCGGATG AACTGGAGGT ACCAGACATT
361 TCTGAAGAGA CAACGTGCGG TGCTGGTGTG GCTGATGCTC AAGCCTTGAA CAGAGTTCGA
421 GTGGTCCCCC CACCAAGCGA TGGACAAAAA ATATTCCAGA TTGACCCCAT GTTGCAAGGC
481 TATAAGTACC ATCTTGAGTA TCGGTACAGC CTCTATAGAA GAATCCGTTC AGACATTGAT
541 GAACATGAAG GAGGCTTGGA AGCCTTCTCC CGTAGTTATG AGAAGTTTGG ATTTAATGCC
601 AGCGCGGAAG GTATCACATA TCGAGAATGG GCTCCTGGAG CATTTTCTGC AGCATTGGTG
661 GGTGACGTCA ACAACTGGGA TCCAAATGCA GATCGTATGA GCAAAAATGA GTTTGGTGTT
721 TGGGAAATTT TTCTGCCTAA CAATGCAGAT GGTACATCAC CTATTCCTCA TGGATCTCGT
781 GTAAAGGTGA GAATGGATAC TCCATCAGGG ATAAAGGATT CAATTCCAGC CTGGATCAAG
841 TACTCAGTGC AGGCCCCAGG AGAAATACCA TATGATGGGA TTTATTATGA TCCTCCTGAA
901 GAGGTAAAGT ATGTGTTCAG GCATGCGCAA CCTAAACGAC CAAAATCATT GCGGATATAT
961 GAAACACATG TCGGAATGAG TAGbcCGGAA CCGAAGATAA ACACATATGT AAACTTTAGG
1021 GATGAAGTCC TCCCAAGAAT AAAAAAACTT GGATACAATG CAGTGCAAAT AATGGCAATC
1081 CAAGAGCACT CATATTATGG AAGCTTTGGA TACCATGTAA CTAATTTTTT TGCGCCAAGT
1141 AGTCGTTTTG GTACCCCAGA AGATTTGAAG TCTTTGATTG ATAGAGCACA TGAGCTTGGT
1201 TTGCTAGTTC TCATGGATGT GGTTCATAGT CATGCGTCAA GTAATACTCT GGATGGGTTG
1261 AATGGTTTTG ATGGTACAGA TACACATTAC TTTCACAGTG GTCCACGTGG CCATCACTGG
1321 ATGTGGGATT CTCGCCTATT TAACTATGGG AACTGGGAAG TTTTAAGATT TCTTCTCTCC
1381 AATGCTAGAT GGTGGCTCGA GGAATATAAG TTTGATGGTT TCCGTTTTGA TGGTGTGACC
1441 TCCATGATGT ACACTCACCA CGGATTACAA GTAACATTTA CGGGGAACTT CAATGAGTAT
1501 TTTGGCTTTG CCACCGATGT AGATGCAGTG GTTTACTTGA TGCTGGTAAA TGATCTAATT
1561 CATGGACTTT ATCCTGAGGC TGTAACCATT GGTGAAGATG TTAGTGGAAT GCCTACATTT
1621 GCCCTTCCTG TTCACGATGG TGGGGTAGGT TTTGACTATC GGATGCATAT GGCTGTGGCT
1681 GACAAATGGA TTGACCTTCT CAAGCAAAGT GATGAAACTT GGAAGATGGG TGATATTGTG
1741 CACACACTGA CAAATAGGAG GTGGTTAGAG AAGTGTGTAA CTTATGCTGA AAGTCATGAT
1801 CAAGCATTAG TCGGCGACAA GACTATTGCG TTTTGGTTGA TGGACAAGGA TATGTATGAT
1861 TTCATGGCCC TCGATAGACC TTCAACTCCT ACCATTGATC GTGGGATAGC ATTACATAAG
1921 ATGATTAGAC TTATCACAAT GGGTTTAGGA GGAGAGGGCT ATCTTAATTT CATGGGAAAT
1981 GAGTTTGGAC ATCCTGAATG GATAGATTTT CCAAGAGGTC CGCAAAGACT TCCAAGTGGT
2041 AAGTTTATTC CAGGGAATAA CAACAGTTAT GACAAATGTC GTCGAAGATT TGACCTGGGT
2101 GATGCAGACT ATCTTAGGTA TCATGGTATG CAAGAGTTTG ATCAGGCAAT GCAACATCTT
2161 GAGCAAAAAT ATGAATTCAT GACATCTGAT CACCAGTATA TTTCCCGGAA ACATGAGGAG
2221 GATAAGGTGA TTGTGTTCGA AAAGGGAGAT TTGGTATTTG TGTTCAACTT CCACTGCAAC
2281 AACAGCTATT TTGACTACCG TATTGGTTGT CGAAAGCCTG GGGTGTATAA GGTGGTCTTG
2341 GACTCCGACG CTGGACTATT TGGTGGATTT AGCAGGATCC ATCACGCAGC CGAGCACTTC
2401 ACCGCCGACT GTTCGCATGA TAATAGGCCA TATTCATTCT CGGTTTATAC ACCAAGCAGA
2461 ACATGTGTCG TCTATGCTCC AGTGGAGTGA TAGCGGGGTA CTCGTTGCTG CGCGGCATGT
2521 GTGGGGCTGT CGATGTGAGG AAAAACCTTC TTCCAAAACC GGCAGATGCA TGCATGCATG
2581 CTACAATAAG GTTCTGATAC TTTAATCGAT GCTGGAAAGC CCATGCATCT CGCTGCGTTG
2641 TCCTCTCTAT ATATATAAGA CCTTCAAGGT GTCAATTAAA CATAGAGTTT TCGTTTTTCG
2701 CTTTCCTAAA AAAAAAAAAA AAAAA
//

TABLE 6

mRNA Sequence and Deduced Amino Acid Sequence of the
Maize Branching Enzyme I and the Transit Peptide
[SEQ ID NO:16 and SEQ ID NO:17]

LOCUS       MZEBEI 2763 bp ss-mRNA PLN
DEFINITION  Maize mRNA for branching enzyme-I (BE-I).
ACCESSION   D11081
KEYWORDS    branching enzyme-I.
SOURCE      Zea mays L. (inbred Oh43), cDNA to mRNA.
  ORGANISM  Zea mays
            Eukaryota; Plantae; Embryobionta; Magnoliophyta; Liliopsida;
            Commelinidae; Liliopsida.
REFERENCE   1 (bases 1 to 2763)
  AUTHORS   Baba,T., Kimura,K., Mizuno,K., Etoh,H., Ishida,Y., Shida,O. and
            Arai,Y.
  TITLE     Sequence conservation of the catalytic regions of amylolytic
            enzymes in maize branching enzyme-I
  JOURNAL   Biochem. Biophys. Res. Commun. 181, 87-94 (1991)
  STANDARD  full automatic
COMMENT     Submitted (30-APR-1992) to DDBJ by: Tadashi Baba
            Institute of Applied Biochemistry
            University of Tsukuba
            Tsukuba, Ibaraki 305
            Japan
            Phone: 0298-53-6632
            Fax: 0298-53-6632.

            NCBI gi: 217959
FEATURES             Location/Qualifiers
     source          1..2763
                     /organism="Zea mays"
     CDS             <1..2470
                     /note="NCBI gi: 217960"
                     /codon_start=2
                     /product="branching enzyme-I precursor"
                     /translation="LCLVSPSSSPTPLPPPRRSRSKADRAAPPGIAGGGNVRLSVLSV
QCKARRSGVRKVKSKFATAATVQEDKTMATAXGDVDHLPIYDLDPKLEIFKDHFRYRM
KRFLSQKGSISENEGSLESFSKGYLKFGINTNEDGTVYREWAPAAQEAELIGDFNDWN
GANHKMEKDKFGVWSIKIDHVKGKPAIPHNSKVKFRFLKGGVWVDRIPALIRYATVDA
SKFGAPYDGVHWDPPASERYTFKHPRPSKPAAPRIYEAKVGMSGEKPAVSTYREFADN
VLPRIRANNYNTVQLMAVMEHSYYASFGYHVTNFFAVSSRSGTPEDLKYLVDKAHSLG
LRVLMDVVKSHASNNVTDGLNGYDVGQSTQESYFKAGDRGYKKLWDSRLFNYANWEVL
RFLLSNLRYWLDEFMFDGFRFDGVTSMLYHHHGINVGFTGNYQEYFSLDTAVDAVVYM
MLANHLMHKLLPEATVVAEDVSGMPVLCRPVDEGGVGFDYRLAMAIPDRWIDYLKNKD
DSEWSMGEIAHTLTNRRYTEKCIAYASSHDQSIVGDKTIAFLLMDKEMYTGMSDLQPA
SPTIDRGIALQKMIHFITMALGGDGYLNFMGNEFGKPSWIDFPREGNNWSYDKCRRQW
SLVDTDHLRYKYMNAFDQAMNALDERFSFLSSSXQIVSDMNDEEKVIVFERGDLVFVF
NFHPKKTYEGYKVGCDLPGKYRVALDSDALVFGGHGRVGHDVDKFTSPEGVPGVPETN
FNNRPNSFKVLSPPRTCVAYYRVDEAGAGRRLHAXAETGXTSPAESIDVKASRASSKE
DKEATAGGKXGWKFARQPSDQDTK"
     transit_peptide 2..190
     mat_peptide     191..2467
                     /EC_number="2.4.1.18"
                     /codon_start=1
                     /product="branching enzyme-I precursor"
     polyA_signal    2734..2739
BASE COUNT 719 A 585 C 737 G 722 T 

ORIGIN 

1 GCTGTGCCTC GTGTCGCCCT CTTCCTCGCC GACTCCGCTT CCGCCGCCGC GGCGCTCTCG 
61 CTCGCATGCT GATCGGGCGG CACCGCCGGG GATCGCGGGT GGCGGCAATG TGCGCCTGAG 
121 TGTGTTGTCT GTCCAGTGCA AGGCTCGCCG GTCAGGGGTG CGGAAGGTCA AGAGCAAATT 
181 CGCCACTGCA GCTACTGTGC AAGAAGATAA AACTATGGCA ACTGCCAAAG GCGATGTCGA 
241 CCATCTCCCC ATATACGACC TGGACCCCAA GCTGGAGATA TTCAAGGACC ATTTCAGGTA 
301 CCGGATGAAA AGATTCCTAG AGCAGAAAGG ATCAATTGAA GAAAATGAGG GAAGTCTTGA 
361 ATCTTTTTCT AAAGGCTATT TGAAATTTGG GATTAATACA AATGAGGATG GAACTGTATA
421 TCGTGAATGG GCACCTGCTG CGCAGGAGGC AGAGCTTATT GGTGACTTCA ATGACTGGAA 
481 TGGTGCAAAC CATAAGATGG AGAAGGATAA ATTTGGTGTT TGGTCGATCA AAATTGACCA 
541 TGTCAAAGGG AAACCTGCCA TCCCTCACAA TTCCAAGGTT AAATTTCGCT TTCTACATGG 
601 TGGAGTATGG GTTGATCGTA TTCCAGCATT GATTCGTTAT GCGACTGTTG ATGCCTCTAA 
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661 ATTTGGAGCT CCCTATGATG GTGTTCATTG GGATCCTCCT GCTTCTGAAA GGTACACATT 
721 TAAGCATCCT CGGCCTTCAA AGCCTGCTGC TCCACGTATC TATGAAGCCC ATGTAGGTAT 
781 GAGTGGTGAA AAGCCAGCAG TAAGCACATA TAGGGAATTT GCAGACAATG TGTTGCCACG 
841 CATACGAGCA AATAACTACA ACACAGTTCA GTTGATGGCA GTTATGGAGC ATTCGTACTA 
901 TGCTTCTTTC GGGTACCATG TGACAAATTT CTTTGCGGTT AGCAGCAGAT CAGGCACACC

961 AGAGGACCTC AAATATCTTG TTGATAAGGC ACACAGTTTG GGTTTGCGAG TTCTGATGGA 
1021 TGTTGTCCAT AGCCATGCAA GTAATAATGT CACAGATGGT TTAAATGGCT ATGATGTTGG 
1081 ACAAAGCACC CAAGAGTCCT ATTTTCATGC GGGAGATAGA GGTTATCATA AACTTTGGGA 
1141 TAGTCGGCTG TTCAACTATG CTAACTGGGA GGTATTAAGG TTTCTTCTTT CTAACCTGAG 

1201 ATATTGGTTG GATGAATTCA TGTTTGATGG CTTCCGATTT GATGGAGTTA CATCAATGCT

1261 GTATCATCAC CATGGTATCA ATGTGGGGTT TACTGGAAAC TACCAGGAAT ATTTCAGTTT 
1321 GGACACAGCT GTGGATGCAG TTGTTTACAT GATGCTTGCA AACCATTTAA TGCACAAACT 
1381 CTTGCCAGAA GCAACTGTTG TTGCTGAAGA TGTTTCAGGC ATGCCGGTCC TTTGCCGGCC 
1441 AGTTGATGAA GGTGGGGTTG GGTTTGACTA TCGCCTGGCA ATGGCTATCC CTGATAGATG 

1501 GATTGACTAC CTGAAGAATA AAGATGACTC TGAGTGGTCG ATGGGTGAAA TAGCGCATAC

1561 TTTGACTAAC AGGAGATATA CTGAAAAATG CATCGCATAT GCTGAGAGCC ATGATCAGTC 
1621 TATTGTTGGC GACAAAACTA TTGCATTTCT CCTGATGGAC AAGGAAATGT ACACTGGCAT 
1681 GTCAGACTTG CAGCCTGCTT CACCTACAAT TGATCGAGGG ATTGCACTCC AAAAGATGAT 
1741 TCACTTCATC ACAATGGCCC TTGGAGGTGA TGGCTACTTG AATTTTATGG GAAATGAGTT 

1801 TGGTCACCCA GAATGGATTG ACTTTCCAAG AGAAGGGAAC AACTGGAGCT ATGATAAATG

1861 CAGACGACAG TGGAGCCTTG TGGACACTGA TCACTTGCGG TACAAGTACA TGAATGCGTT 
1921 TGACCAAGCG ATGAATGCGC TCGATGAGAG ATTTTCCTTC CTTTCGTCGT CAAAGCAGAT 
1981 CGTCAGCGAC ATGAACGATG AGGAAAAGGT TATTGTCTTT GAACGTGGAG ATTTAGTTTT 
2041 TGTTTTCAAT TTCCATCCCA AGAAAACTTA CGAGGGCTAC AAAGTGGGAT GCGATTTGCC 

2101 TGGGAAATAC AGAGTAGCCC TGGACTCTGA TGCTCTGGTC TTCGGTGGAC ATGGAAGAGT

2161 TGGCCACGAC GTGGATCACT TCACGTCGCC TGAAGGGGTG CCAGGGGTGC CCGAAACGAA 
2221 CTTCAACAAC CGGCCGAACT CGTTCAAAGT CCTTTCTCCG CCCCGCACCT GTGTGGCTTA 
2281 TTACCGTGTA GACGAAGCAG GGGCTGGACG ACGTCTTCAC GCGAAAGCAG AGACAGGAAA 
2341 GACGTCTCCA GCAGAGAGCA TCGACGTCAA AGCTTCCAGA GCTAGTAGCA AAGAAGACAA 

2401 GGAGGCAACG GCTGGTGGCA AGAAGGGATG GAAGTTTGCG CGGCAGCCAT CCGATCAAGA

2461 TACCAAATGA AGCCACGAGT CCTTGGTGAG GACTGGACTG GCTGCCGGCG CCCTGTTAGT 
2521 AGTCCTGCTC TACTGGACTA GCCGCCGCTG GCGCCCTTGG AACGGTCCTT TCCTGTAGCT 
2581 TGCAGGCGAC TGGTGTCTCA TCACCGAGCA GGCAGGCACT GCTTGTATAG CTTTTCTAGA 
2641 ATAATAATCA GGGATGGATG GATGGTGTGT ATTGGCTATC TGGCTAGACG TGCATGTGCC 

2701 CAGTTTGTAT GTACAGGAGC AGTTCCCGTC CAGAATAAAA AAAAACTTGT TGGGGGGTTT

2761 TTC 
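
The coordinates in the Table 6 feature table can be cross-checked with simple arithmetic. The following is only an illustrative sketch (the constant and function names are hypothetical and not part of the GenBank record): it confirms that the BASE COUNT entries sum to the locus length and that the CDS, transit peptide, and mature peptide ranges each span a whole number of codons.

```python
# Values copied from the Table 6 record (LOCUS, BASE COUNT and FEATURES lines).
LOCUS_LENGTH = 2763
BASE_COUNT = {"A": 719, "C": 585, "G": 737, "T": 722}
CODON_START = 2          # /codon_start=2 for the partial CDS <1..2470
CDS_END = 2470
TRANSIT_PEPTIDE = (2, 190)
MATURE_PEPTIDE = (191, 2467)


def codons_in(start, end):
    """Return the number of whole codons in the inclusive range start..end."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}..{end} is not a multiple of 3")
    return length // 3


total_bases = sum(BASE_COUNT.values())
cds_codons = codons_in(CODON_START, CDS_END)      # includes the stop codon
transit_codons = codons_in(*TRANSIT_PEPTIDE)
mature_codons = codons_in(*MATURE_PEPTIDE)

if __name__ == "__main__":
    print("bases counted:", total_bases, "of", LOCUS_LENGTH)
    print("CDS codons (with stop):", cds_codons)
    print("transit + mature residues:", transit_codons + mature_codons)
```

Under these assumptions the transit peptide (63 codons) and mature peptide (759 codons) together account for every CDS codon except the terminal stop codon.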

TABLE 7

Coding Sequence and Deduced Amino Acid Sequence for
Transit Peptide Region of the
Soluble Starch Synthase I Maize Gene (153 bp)
[SEQ ID NO:18 and SEQ ID NO:19]

FILE NAME : MSSITRPT.DNA SEQUENCE : NORMAL 153 BP
CODON TABLE : UNIV.TCN
SEQUENCE REGION : 1 - 153
TRANSLATION REGION : 1 - 153

*** DNA TRANSLATION ***

  1 ATG GCG ACG CCC TCG GCC GTG GGC GCC GCG TGC CTC CTC CTC GCG CGG  48
  1  M   A   T   P   S   A   V   G   A   A   C   L   L   L   A   R   16

 49 GCC GCC TGG CCG GCC GCC GTC GGC GAC CGG GCG CGC CCG CGG AGG CTC  96
 17  A   A   W   P   A   A   V   G   D   R   A   R   P   R   R   L   32

 97 CAG CGC GTG CTG CGC CGC CGG TGC GTC GCG GAG CTG AGC AGG GAG GGG 144
 33  Q   R   V   L   R   R   R   C   V   A   E   L   S   R   E   G   48

145 CCC CAT ATG 153
 49  P   H   M   51
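
The translation in Table 7 can be reproduced with the standard genetic code. The sketch below is illustrative only (the variable and function names are hypothetical, and the standard code is assumed to correspond to the UNIV.TCN codon table named in the listing); it translates the 153 bp coding sequence and locates the eight-residue fragment CVAELSRE within it.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Coding sequence as printed in Table 7 (spaces separate codons).
TRANSIT_PEPTIDE_DNA = (
    "ATG GCG ACG CCC TCG GCC GTG GGC GCC GCG TGC CTC CTC CTC GCG CGG "
    "GCC GCC TGG CCG GCC GCC GTC GGC GAC CGG GCG CGC CCG CGG AGG CTC "
    "CAG CGC GTG CTG CGC CGC CGG TGC GTC GCG GAG CTG AGC AGG GAG GGG "
    "CCC CAT ATG"
)


def translate(dna):
    """Translate a DNA string codon by codon, ignoring whitespace."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


protein = translate(TRANSIT_PEPTIDE_DNA)
fragment_start = protein.find("CVAELSRE") + 1  # 1-based residue number

if __name__ == "__main__":
    print(protein)
    print("CVAELSRE begins at residue", fragment_start)
```

With the sequence as printed, the fragment begins at residue 40, just upstream of the Gly-Pro-His-Met residues that close the table.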
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GFP constructs: 

1. GFP only in pET-21a:

pEXS115 is digested with Nde I and Xho I and the 740 bp fragment containing the
SGFP coding sequence is subcloned into the Nde I and Xho I sites of pET-21a (Novagen, 601
Science Dr., Madison, WI). (See FIG. 2b, GFP-21a map.)

2. GFP subcloned in-frame at the 5' end of full-length mature WX:

The 740 bp Nde I fragment containing SGFP from pEXS114 is subcloned into the Nde
I site of pEXSWX. (See FIG. 3a, GFP-FLWX map.)

3. GFP subcloned in-frame at the 5' end of N-terminally truncated WX: WX truncated
by 700 bp at N-terminus.

The 1 kb BamH I fragment encoding the C-terminus of WX from pEXSWX is
subcloned into the Bgl II site of pEXS115. Then the entire SGFP-truncated WX fragment is
subcloned into pET-21a as an Nde I-Hind III fragment. (See FIG. 3b, GFP-BamHIWX map.)

4. GFP subcloned in-frame at the 5' end of truncated WX: WX truncated by 100 bp at
N-terminus.

The 740 bp Nde I-Nco I fragment containing SGFP from pEXS115 is subcloned into
pEXSWX at the Nde I and Nco I sites. (See FIG. 4, GFP-NcoWX map.)

Example Three: 

Plasmid Transformation into Bacteria: 

Escherichia coli competent cell preparation: 

1. Inoculate 2.5 ml LB media with a single colony of the desired E. coli strain
(the selected strain was XL1-Blue or BL21(DE3), from Stratagene), including appropriate
antibiotics. Grow at 37°C, 250 rpm overnight.

2. Inoculate 100 ml of LB media with a 1:50 dilution of the overnight culture,
including appropriate antibiotics. Grow at 37°C, 250 rpm until OD500 = 0.3-0.5.

3. Transfer culture to sterile centrifuge bottle and chill on ice for 15 minutes.

4. Centrifuge 5 minutes at 3,000x g (4°C).

5. Resuspend pellet in 8 ml ice-cold Transformation Buffer 1. Incubate on ice for
15 minutes.

6. Centrifuge 5 minutes at 3,000x g (4°C).

7. Resuspend pellet in 8 ml ice-cold Transformation Buffer 2. Aliquot, flash-freeze
in liquid nitrogen, and store at -70°C.
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Transformation Buffer 1 

RbCl            1.2 g
MnCl2·4H2O      0.99 g
K-Acetate       0.294 g
CaCl2·2H2O      0.15 g
Glycerol        15 g
dH2O            100 ml

pH to 5.8 with 0.2 M acetic acid
Filter sterilize



Transformation Buffer 2

MOPS (10 mM)    0.209 g
RbCl            0.12 g
CaCl2·2H2O      1.1 g
Glycerol        15 g
dH2O            100 ml

pH to 6.8 with NaOH
Filter sterilize
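
The two recipes give masses per 100 ml; converting them to approximate molar concentrations makes them easier to compare with other rubidium chloride protocols. The sketch below is an illustration only: the molar masses are standard reference values rather than figures from this document, the function names are hypothetical, and the volume contribution of glycerol is ignored.

```python
# Approximate molar masses in g/mol (standard reference values, an assumption here).
MOLAR_MASS = {
    "RbCl": 120.92,
    "MnCl2.4H2O": 197.90,
    "K-acetate": 98.14,
    "CaCl2.2H2O": 147.01,
    "MOPS": 209.26,
}

# Grams per 100 ml, as listed for each buffer.
BUFFER_1 = {"RbCl": 1.2, "MnCl2.4H2O": 0.99, "K-acetate": 0.294, "CaCl2.2H2O": 0.15}
BUFFER_2 = {"MOPS": 0.209, "RbCl": 0.12, "CaCl2.2H2O": 1.1}


def millimolar(grams, molar_mass, volume_ml=100.0):
    """Convert a mass dissolved in volume_ml to a concentration in mM."""
    moles = grams / molar_mass
    litres = volume_ml / 1000.0
    return moles / litres * 1000.0


def concentrations(recipe, volume_ml=100.0):
    """Return {component: concentration in mM, rounded to 0.1 mM}."""
    return {
        name: round(millimolar(grams, MOLAR_MASS[name], volume_ml), 1)
        for name, grams in recipe.items()
    }


if __name__ == "__main__":
    for label, recipe in (("Buffer 1", BUFFER_1), ("Buffer 2", BUFFER_2)):
        print(label, concentrations(recipe))
```

As a sanity check, the 0.209 g of MOPS works out to roughly 10 mM, in agreement with the "(10 mM)" annotation in the Buffer 2 recipe.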



Escherichia coli transformation by rubidium chloride heat shock method: Hanahan, D.
(1985) in DNA Cloning: A Practical Approach (Glover, D.M., ed.), pp. 109-135, IRL Press.

1. Incubate 1-5 µl of DNA on ice with 150 µl E. coli competent cells for 30
minutes.

2. Heat shock at 42°C for 45 seconds.

3. Immediately place on ice for 2 minutes.

4. Add 600 µl LB media and incubate at 37°C for 1 hour.

5. Plate on LB agar including the appropriate antibiotics.



This plasmid will express the hybrid polypeptide containing the green fluorescent 
protein within the bacteria. 

Example Four: 
Expression of Construct in E. coli:

1. Inoculate 3 ml LB with E. coli containing plasmid of interest. Include appropriate
antibiotics. Grow at 37°C, 250 rpm, overnight.

2. Inoculate 100 ml LB with 2 ml of overnight culture. Include appropriate antibiotics.
Grow at 37°C, 250 rpm.

3. At an OD of about 0.4-0.5, place at room temperature, 200 rpm.

4. At an OD of about 0.6-0.8, induce with 100 µl of 1 M IPTG. Final IPTG concentration
is 1 mM.

5. Grow at room temperature, 200 rpm, 4-5 hours.

6. Collect cells by centrifugation.

7. Flash freeze in liquid nitrogen and store at -70°C until use.
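
The dilution figures in the steps above follow from C1V1 = C2V2. A minimal sketch of that arithmetic is given below; the function names are hypothetical, and the 2 ml inoculum is ignored when computing the induction concentration.

```python
def final_concentration_mm(stock_molar, stock_volume_ul, culture_volume_ml):
    """Final concentration (mM) after adding a stock solution to a culture."""
    stock_ml = stock_volume_ul / 1000.0
    total_ml = culture_volume_ml + stock_ml
    return stock_molar * 1000.0 * stock_ml / total_ml


def dilution_factor(inoculum_ml, medium_ml):
    """Fold dilution expressed as medium volume per unit inoculum (e.g. 50 for 1:50)."""
    return medium_ml / inoculum_ml


if __name__ == "__main__":
    # 100 µl of 1 M IPTG into a 100 ml culture
    print(round(final_concentration_mm(1.0, 100, 100), 3), "mM IPTG")
    # 2 ml of overnight culture into 100 ml LB
    print("1:%d inoculum" % dilution_factor(2, 100))
```

The result is just under 1 mM, which matches the stated final IPTG concentration to within rounding.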

Cells can be resuspended in dH2O and viewed under UV light for
intrinsic fluorescence. Alternatively, the cells can be sonicated and an aliquot of the cell
extract can be separated by SDS-PAGE and viewed under UV light to detect GFP
fluorescence. When the protein employed is a green fluorescent protein, the presence of the
protein in the lysed material can be evaluated under UV at 395 nm in a light box and the
signature green glow can be identified.
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Example Five: 

Plasmid Extraction from Bacteria: 

The following is one of many common alkaline lysis plasmid purification protocols 
useful in practicing this invention. 

1. Inoculate 100-200 ml LB media with a single colony of E. coli transformed with
one of the plasmids described above. Include appropriate antibiotics. Grow at 37°C,
250 rpm overnight.

2. Centrifuge 10 minutes at 5,000x g (4°C).

3. Resuspend cells in 10 ml water, transfer to a 15 ml centrifuge tube, and repeat
centrifugation.

4. Resuspend pellet in 5 ml 0.1 M NaOH, 0.5% SDS. Incubate on ice for 10 minutes.

5. Add 2.5 ml of 3 M sodium acetate (pH 5.2), invert gently, and incubate 10 minutes on
ice.

6. Centrifuge 5 minutes at 15,000-20,000x g (4°C).

7. Extract supernatant with an equal volume of phenol:chloroform:isoamyl alcohol
(25:24:1).

8. Centrifuge 10 minutes at 6,000-10,000x g (4°C).

9. Transfer aqueous phase to clean tube and precipitate with 1 volume of isopropanol.

10. Centrifuge 15 minutes at 12,000x g (4°C).

11. Dissolve pellet in 0.5 ml TE, add 20 µl of 10 mg/ml RNase, and incubate 1 hour at
37°C.

12. Extract twice with phenol:chloroform:isoamyl alcohol (25:24:1).

13. Extract once with chloroform.

14. Precipitate aqueous phase with 1 volume of isopropanol and 0.1 volume of 3 M
sodium acetate.

15. Wash pellet once with 70% ethanol.

16. Dry pellet in Speed Vac and resuspend pellet in TE.

This plasmid can then be inserted into other hosts.



TABLE 8 

DNA Sequence and Deduced Amino Acid Sequence of
Starch Synthase Coding Region from pEXS52 [SEQ ID NO:20; SEQ ID NO:21]

FILE NAME : MSSIDELN.DNA SEQUENCE : NORMAL 1626 BP
CODON TABLE : UNIV.TCN
SEQUENCE REGION : 1 - 1626
TRANSLATION REGION : 1 - 1626

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

TGC GTC GCG GAG CTG AGC AGG GAG GAG CTC GGT CTC GAA CCT GAA GGG  48
Cys Val Ala Glu Leu Ser Arg Glu Asp Leu Gly Leu Glu Pro Glu Gly
55 60 65

ATT GCT GAA GGT TCC ATC GAT AAC ACA GTA GTT GTG GCA AGT GAG CAA  96
Ile Ala Glu Gly Ser Ile Asp Asn Thr Val Val Val Ala Ser Glu Gln
70 75 80

GAT TCT GAG ATT GTG GTT GGA AAG GAG CAA GCT CCA GCT AAA GTA ACA  144
Asp Ser Glu Ile Val Val Gly Lys Glu Gln Ala Arg Ala Lys Val Thr
85 90 95

CAA AGC ATT GTC TTT GTA ACC GGC GAA GCT TCT CCT TAT GCA AAG TCT  192
Gln Ser Ile Val Phe Val Thr Gly Glu Ala Ser Pro Tyr Ala Lys Ser
100 105 110 115

GGG GGT CTA GGA GAT GTT TGT GGT TCA TTG CCA GTT GCT CTT GCT GCT  240
Gly Gly Leu Gly Asp Val Cys Gly Ser Leu Pro Val Ala Leu Ala Ala
120 125 130

CGT GGT CAC CGT GTG ATG GTT GTA ATG CCC AGA TAT TTA AAT GGT ACC  288
Arg Gly His Arg Val Met Val Val Met Pro Arg Tyr Leu Asn Gly Thr
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135 140 145 

TCC GAT AAG AAT TAT GCA AAT GCA TTT TAG ACA GAA AAA GAG ATT CGG  336
Ser Asp Lys Asn Tyr Ala Asn Ala Phe Tyr Thr Glu Lys His Ile Arg
150 155 160

ATT CCA TGC TTT GGC GGT GAA CAT GAA GTT ACC TTC TTC CAT GAG TAT  384
Ile Pro Cys Phe Gly Gly Glu His Glu Val Thr Phe Phe His Glu Tyr
165 170 175

AGA GAT TCA GTT GAC TGG GTG TTT GTT GAT CAT CCC TCA TAT CAC AGA  432
Arg Asp Ser Val Asp Trp Val Phe Val Asp His Pro Ser Tyr His Arg
180 185 190 195

CCT GGA AAT TTA TAT GGA GAT AAG TTT GGT GCT TTT GGT GAT AAT GAG  480
Pro Gly Asn Leu Tyr Gly Asp Lys Phe Gly Ala Phe Gly Asp Asn Gln
200 205 210

TTC AGA TAG ACA CTC CTT TGC TAT GCT GCA TGT GAG GCT CCT TTG ATC  528
Phe Arg Tyr Thr Leu Leu Cys Tyr Ala Ala Cys Glu Ala Pro Leu Ile
215 220 225

CTT GAA TTG GGA GGA TAT ATT TAT GGA CAG AAT TGC ATG TTT GTT GTC  576
Leu Glu Leu Gly Gly Tyr Ile Tyr Gly Gln Asn Cys Met Phe Val Val
230 235 240

AAT GAT TGG CAT GCC AGT CTA GTG CCA GTC CTT CTT GCT GCA AAA TAT  624
Asn Asp Trp His Ala Ser Leu Val Pro Val Leu Leu Ala Ala Lys Tyr
245 250 255

AGA CCA TAT GGT GTT TAT AAA GAC TCC CGG AGC ATT CTT GTA ATA CAT  672
Arg Pro Tyr Gly Val Tyr Lys Asp Ser Arg Ser Ile Leu Val Ile His
260 265 270 275

AAT TTA GCA CAT CAG GGT GTA GAG CCT GCA AGC ACA TAT CCT GAC CTT  720
Asn Leu Ala His Gln Gly Val Glu Pro Ala Ser Thr Tyr Pro Asp Leu
280 285 290

GGG TTG CCA CCT GAA TGG TAT GGA GCT GTG GAG TGG GTA TTC CCT GAA  768
Gly Leu Pro Pro Glu Trp Tyr Gly Ala Leu Glu Trp Val Phe Pro Glu
295 300 305

TGG GCG AGG AGG CAT GCC CTT GAC AAG GGT GAG GCA GTT AAT TTT TTG  816
Trp Ala Arg Arg His Ala Leu Asp Lys Gly Glu Ala Val Asn Phe Leu
310 315 320

AAA GGT GCA GTT GTG ACA GCA GAT GGA ATC GTG ACT GTC AGT AAG GGT  864
Lys Gly Ala Val Val Thr Ala Asp Arg Ile Val Thr Val Ser Lys Gly
325 330 335

TAT TCG TGG GAG GTC ACA ACT GCT GAA GGT GGA CAG GGC CTC AAT GAG  912
Tyr Ser Trp Glu Val Thr Thr Ala Glu Gly Gly Gln Gly Leu Asn Glu
340 345 350 355

CTC TTA AGC TCC AGA AAG AGT GTA TTA AAC GGA ATT GTA AAT GGA ATT  960
Leu Leu Ser Ser Arg Lys Ser Val Leu Asn Gly Ile Val Asn Gly Ile
360 365 370

GAC ATT AAT GAT TGG AAC CCT GCC ACA GAC AAA TGT ATC CCC TGT CAT  1008
Asp Ile Asn Asp Trp Asn Pro Ala Thr Asp Lys Cys Ile Pro Cys His
375 380 385

TAT TCT GTT GAT GAC CTC TGT GGA AAG GCC AAA TGT AAA GGT GCA TTG  1056
Tyr Ser Val Asp Asp Leu Ser Gly Lys Ala Lys Cys Lys Gly Ala Leu
390 395 400

CAG AAG GAG CTG GGT TTA CCT ATA AGG CCT GAT GTT CCT CTG ATT GGC  1104
Gln Lys Glu Leu Gly Leu Pro Ile Arg Pro Asp Val Pro Leu Ile Gly
405 410 415

TTT ATT GGA AGG TTG GAT TAT CAG AAA GGC ATT GAT CTC ATT CAA CTT  1152
Phe Ile Gly Arg Leu Asp Tyr Gln Lys Gly Ile Asp Leu Ile Gln Leu
420 425 430 435

ATC ATA CCA GAT CTC ATG CGG GAA GAT GTT CAA TTT GTC ATG CTT GGA  1200
Ile Ile Pro Asp Leu Met Arg Glu Asp Val Gln Phe Val Met Leu Gly
440 445 450

TCT GGT GAC CCA GAG CTT GAA GAT TGG ATG AGA TCT ACA GAG TCG ATC  1248
Ser Gly Asp Pro Glu Leu Glu Asp Trp Met Arg Ser Thr Glu Ser Ile
455 460 465

TTC AAG GAT AAA TTT CGT GGA TGG GTT GGA TTT AGT GTT CCA GTT TCC  1296
Phe Lys Asp Lys Phe Arg Gly Trp Val Gly Phe Ser Val Pro Val Ser
470 475 480

CAC CGA ATA ACT GCC GGC TGC GAT ATA TTG TTA ATG CCA TCC AGA TTC  1344
His Arg Ile Thr Ala Gly Cys Asp Ile Leu Leu Met Pro Ser Arg Phe
485 490 495

GAA CCT TGT GGT CTC AAT CAG CTA TAT GCT ATG CAG TAT GGC ACA GTT  1392
Glu Pro Cys Gly Leu Asn Gln Leu Tyr Ala Met Gln Tyr Gly Thr Val
500 505 510 515

CCT GTT GTC CAT GCA ACT GGG GGC CTT AGA GAT ACC GTG GAG AAC TTC  1440
Pro Val Val His Ala Thr Gly Gly Leu Arg Asp Thr Val Glu Asn Phe
520 525 530

AAC CCT TTC GGT GAG AAT GGA GAG CAG GGT ACA GGG TGG GCA TTC GCA  1488
Asn Pro Phe Gly Glu Asn Gly Glu Gln Gly Thr Gly Trp Ala Phe Ala
535 540 545

CCC CTA ACC ACA GAA AAC ATG TTT GTG GAC ATT GCG AAC TGC AAT ATC  1536
Pro Leu Thr Thr Glu Asn Met Phe Val Asp Ile Ala Asn Cys Asn Ile
550 555 560

TAC ATA CAG GGA ACA CAA GTC CTC CTG GGA AGG GCT AAT GAA GCG AGG  1584
Tyr Ile Gln Gly Thr Gln Val Leu Leu Gly Arg Ala Asn Glu Ala Arg
565 570 575

CAT GTC AAA AGA CTT CAC GTG GGA CCA TGC CGC TGA  1620
His Val Lys Arg Leu His Val Gly Pro Cys Arg *
580 585 590

(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 540 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

Cys Val Ala Glu Leu Ser Arg Glu Asp Leu Gly Leu Glu Pro Glu Gly
1 5 10 15

Ile Ala Glu Gly Ser Ile Asp Asn Thr Val Val Val Ala Ser Glu Gln
20 25 30

Asp Ser Glu Ile Val Val Gly Lys Glu Gln Ala Arg Ala Lys Val Thr
35 40 45

Gln Ser Ile Val Phe Val Thr Gly Glu Ala Ser Pro Tyr Ala Lys Ser
50 55 60

Gly Gly Leu Gly Asp Val Cys Gly Ser Leu Pro Val Ala Leu Ala Ala
65 70 75 80

Arg Gly His Arg Val Met Val Val Met Pro Arg Tyr Leu Asn Gly Thr
85 90 95

Ser Asp Lys Asn Tyr Ala Asn Ala Phe Tyr Thr Glu Lys His Ile Arg
100 105 110

Ile Pro Cys Phe Gly Gly Glu His Glu Val Thr Phe Phe His Glu Tyr
115 120 125

Arg Asp Ser Val Asp Trp Val Phe Val Asp His Pro Ser Tyr His Arg
130 135 140

Pro Gly Asn Leu Tyr Gly Asp Lys Phe Gly Ala Phe Gly Asp Asn Gln
145 150 155 160

Phe Arg Tyr Thr Leu Leu Cys Tyr Ala Ala Cys Glu Ala Pro Leu Ile
165 170 175

Leu Glu Leu Gly Gly Tyr Ile Tyr Gly Gln Asn Cys Met Phe Val Val
180 185 190

Asn Asp Trp His Ala Ser Leu Val Pro Val Leu Leu Ala Ala Lys Tyr
195 200 205

Arg Pro Tyr Gly Val Tyr Lys Asp Ser Arg Ser Ile Leu Val Ile His
210 215 220



Asn Leu Ala His Gln Gly Val Glu Pro Ala Ser Thr Tyr Pro Asp Leu
225 230 235 240

Gly Leu Pro Pro Glu Trp Tyr Gly Ala Leu Glu Trp Val Phe Pro Glu
245 250 255

Trp Ala Arg Arg His Ala Leu Asp Lys Gly Glu Ala Val Asn Phe Leu
260 265 270

Lys Gly Ala Val Val Thr Ala Asp Arg Ile Val Thr Val Ser Lys Gly
275 280 285

Tyr Ser Trp Glu Val Thr Thr Ala Glu Gly Gly Gln Gly Leu Asn Glu
290 295 300

Leu Leu Ser Ser Arg Lys Ser Val Leu Asn Gly Ile Val Asn Gly Ile
305 310 315 320

Asp Ile Asn Asp Trp Asn Pro Ala Thr Asp Lys Cys Ile Pro Cys His
325 330 335

Tyr Ser Val Asp Asp Leu Ser Gly Lys Ala Lys Cys Lys Gly Ala Leu
340 345 350

Gln Lys Glu Leu Gly Leu Pro Ile Arg Pro Asp Val Pro Leu Ile Gly
355 360 365

Phe Ile Gly Arg Leu Asp Tyr Gln Lys Gly Ile Asp Leu Ile Gln Leu
370 375 380

Ile Ile Pro Asp Leu Met Arg Glu Asp Val Gln Phe Val Met Leu Gly
385 390 395 400

Ser Gly Asp Pro Glu Leu Glu Asp Trp Met Arg Ser Thr Glu Ser Ile
405 410 415

Phe Lys Asp Lys Phe Arg Gly Trp Val Gly Phe Ser Val Pro Val Ser
420 425 430

His Arg Ile Thr Ala Gly Cys Asp Ile Leu Leu Met Pro Ser Arg Phe
435 440 445

Glu Pro Cys Gly Leu Asn Gln Leu Tyr Ala Met Gln Tyr Gly Thr Val
450 455 460

Pro Val Val His Ala Thr Gly Gly Leu Arg Asp Thr Val Glu Asn Phe
465 470 475 480

Asn Pro Phe Gly Glu Asn Gly Glu Gln Gly Thr Gly Trp Ala Phe Ala
485 490 495

Pro Leu Thr Thr Glu Asn Met Phe Val Asp Ile Ala Asn Cys Asn Ile
500 505 510

Tyr Ile Gln Gly Thr Gln Val Leu Leu Gly Arg Ala Asn Glu Ala Arg
515 520 525

His Val Lys Arg Leu His Val Gly Pro Cys Arg *
530 535 540

Example Six: 

This experiment employs a plasmid having a maize promoter, a maize transit peptide,
a starch-encapsulating region from the starch synthase I gene, and a ligated gene fragment
attached thereto. The plasmid shown in FIG. 6 contains the DNA sequence listed in Table 8.

Plasmid pEXS52 was constructed according to the following protocol: 

Materials used to construct transgenic plasmids are as follows:

Plasmid pBluescript SK-
Plasmid pMF6 (contains nos3' terminator)
Plasmid pHKH1 (contains maize adh1 intron)
Plasmid MstsI(6-4) (contains maize STSI transit peptide; used as a template to PCR out
the STSI transit peptide)
Plasmid MstsIII in pBluescript SK-

Primers EXS29 (GTGGATCCATGGCGACGCCCTCGGCCGTGG) [SEQ ID NO:22]
EXS35 (CTGAATTCCATATGGGGCCCCTCCCTGCTCAGCTC) [SEQ ID NO:23]
both used for PCR of the STSI transit peptide
Primers EXS31 (CTCTGAGCTCAAGCTTGCTACTTTCTTTCCTTAATG) [SEQ ID NO:24]
EXS32 (GTCTCCGCGGTGGTGTCCTTGCTTCCTAG) [SEQ ID NO:25]
both used for PCR of the maize 10 kD zein promoter (Journal: Gene 71:359-370 [1988])
Maize A632 genomic DNA (used as a template for PCR of the maize 10 kD zein promoter).
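
The primer sequences can be inspected for the restriction sites used in the cloning steps. The sketch below is illustrative only (the function names are hypothetical, and the recognition sequences are the standard ones for the named enzymes rather than values taken from this document). It lists the sites present in each primer and measures how much of the 3' end of EXS29 and EXS35 matches the 153 bp transit peptide coding sequence of Table 7.

```python
# Standard recognition sequences for enzymes named in the cloning steps.
SITES = {
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "NdeI": "CATATG",
    "SacI": "GAGCTC",
    "SacII": "CCGCGG",
    "HindIII": "AAGCTT",
}

PRIMERS = {
    "EXS29": "GTGGATCCATGGCGACGCCCTCGGCCGTGG",
    "EXS35": "CTGAATTCCATATGGGGCCCCTCCCTGCTCAGCTC",
    "EXS31": "CTCTGAGCTCAAGCTTGCTACTTTCTTTCCTTAATG",
    "EXS32": "GTCTCCGCGGTGGTGTCCTTGCTTCCTAG",
}

# Transit peptide coding sequence as printed in Table 7.
TRANSIT_PEPTIDE = "".join((
    "ATG GCG ACG CCC TCG GCC GTG GGC GCC GCG TGC CTC CTC CTC GCG CGG "
    "GCC GCC TGG CCG GCC GCC GTC GGC GAC CGG GCG CGC CCG CGG AGG CTC "
    "CAG CGC GTG CTG CGC CGC CGG TGC GTC GCG GAG CTG AGC AGG GAG GGG "
    "CCC CAT ATG"
).split())


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def sites_in(seq):
    """Names of the listed restriction sites that occur in seq."""
    return [name for name, site in SITES.items() if site in seq]


def longest_3prime_match(primer, template):
    """Longest suffix (3' end) of primer that occurs in template."""
    for start in range(len(primer)):
        if primer[start:] in template:
            return primer[start:]
    return ""


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, sites_in(seq))
    fwd = longest_3prime_match(PRIMERS["EXS29"], TRANSIT_PEPTIDE)
    rev = longest_3prime_match(PRIMERS["EXS35"], reverse_complement(TRANSIT_PEPTIDE))
    print("EXS29 anneals over", len(fwd), "nt; EXS35 over", len(rev), "nt")
```

Read this way, EXS29 carries a BamHI site ahead of the first 22 nt of the transit peptide, and EXS35 carries EcoRI and NdeI sites with its 3' end complementary to the last 27 nt, consistent with cloning the PCR product at EcoRI and BamHI sites.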

Step 1: Clone maize 10 kD zein promoter in pBluescript SK- (named as pEXS10zp).

1. PCR 1.1 kb maize 10 kD zein promoter
primers: EXS31, EXS32
template: maize A632 genomic DNA

2. Clone 1.1 kb maize 10 kD zein promoter PCR product into pBluescript SK-
plasmid at SacI and SacII sites (see FIG. 7).

Step 2: Delete NdeI site in pEXS10zp (named as pEXS10zp-NdeI).

The NdeI site is removed from the maize 10 kD zein promoter in pBluescript SK- by
fill-in and blunt-end ligation.

Step 3: Clone maize adh1 intron in pBluescript SK- (named as pEXSadh1).

Maize adh1 intron is released from plasmid pHKH1 at XbaI and BamHI sites. Maize
adh1 intron (XbaI/BamHI fragment) is cloned into pBluescript SK- at XbaI and BamHI
sites (see FIG. 7).

Step 4: Clone maize 10 kD zein promoter and maize adh1 intron into pBluescript SK-
(named as pEXS10zp-adh1).

Maize 10 kD zein promoter is released from plasmid pEXS10zp-NdeI at SacI and
SacII sites. Maize 10 kD zein promoter (SacI/SacII fragment) is cloned into plasmid
pEXSadh1 (contains maize adh1 intron) at SacI and SacII sites (see FIG. 7).

Step 5: Clone maize nos3' terminator into plasmid pEXSadh1 (named as
pEXSadh1-nos3').

Maize nos3' terminator is released from plasmid pMF6 at EcoRI and HindIII sites.
Maize nos3' terminator (EcoRI/HindIII fragment) is cloned into plasmid pEXSadh1 at
EcoRI and HindIII sites (see FIG. 7).

Step 6: Clone maize nos3' terminator into plasmid pEXS10zp-adh1 (named as
pEXS10zp-adh1-nos3').

Maize nos3' terminator is released from plasmid pEXSadh1-nos3' at EcoRI and ApaI
sites. Maize nos3' terminator (EcoRI/ApaI fragment) is cloned into plasmid
pEXS10zp-adh1 at EcoRI and ApaI sites (see FIG. 7).



Step 7: Clone maize STSI transit peptide into plasmid pEXS10zp-adh1-nos3' (named as
pEXS33).

1. PCR 150 bp maize STSI transit peptide
primers: EXS29, EXS35
template: MSTSI(6-4) plasmid

2. Clone 150 bp maize STSI transit peptide PCR product into plasmid
pEXS10zp-adh1-nos3' at EcoRI and BamHI sites (see FIG. 7).

Step 8: Site-directed mutagenesis on maize STSI transit peptide in pEXS33 (named as
pEXS33(m)).

There is a mutation (stop codon) in the maize STSI transit peptide in plasmid pEXS33.
Site-directed mutagenesis is carried out to change the stop codon to a non-stop codon. The new
plasmid (containing maize 10 kD zein promoter, maize STSI transit peptide, maize
adh1 intron, maize nos3' terminator) is named pEXS33(m).

Step 9: NotI site in pEXS33(m) deleted (named as pEXS50).

The NotI site is removed from pEXS33(m) by NotI fill-in and blunt-end ligation to form pEXS50
(see FIG. 8).

Step 10: Maize adh1 intron deleted in pEXS33(m) (named as pEXS60).

Maize adh1 intron is removed by NotI/BamHI digestion, filled in with Klenow
fragment, and blunt-end ligated to form pEXS60 (see FIG. 9).

Step 11: Clone maize STSIII into pEXS50, pEXS60.

Maize STSIII is released from plasmid maize STSIII in pBluescript SK- at NdeI and
EcoRI sites. Maize STSIII (NdeI-EcoRI fragment) is cloned into pEXS50 and pEXS60
separately, named as pEXS51 and pEXS61 (see FIGS. 8 and 9, respectively).

Step 12: Clone the gene in Table 8 into pEXS51 at NdeI/NotI site to form pEXS52.

Other similar plasmids can be made by cloning other genes (STSI, II, WX,
glgA, glgB, glgC, BEI, BEII, etc.) into pEXS51, pEXS61 at NdeI/NotI site.
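
The twelve steps form a derivation tree in which each named plasmid is built from one parent backbone. As a hedged illustration (the data structure and function name are hypothetical, and the plasmid names are as read from the steps above), the lineage of pEXS52 can be traced like this:

```python
# Each plasmid maps to (parent backbone, modification), following Steps 1-12.
PLASMIDS = {
    "pEXS10zp": ("pBluescript SK-", "10 kD zein promoter PCR product, SacI/SacII (Step 1)"),
    "pEXS10zp-NdeI": ("pEXS10zp", "NdeI site removed (Step 2)"),
    "pEXSadh1": ("pBluescript SK-", "adh1 intron from pHKH1, XbaI/BamHI (Step 3)"),
    "pEXS10zp-adh1": ("pEXSadh1", "zein promoter from pEXS10zp-NdeI, SacI/SacII (Step 4)"),
    "pEXSadh1-nos3'": ("pEXSadh1", "nos3' terminator from pMF6, EcoRI/HindIII (Step 5)"),
    "pEXS10zp-adh1-nos3'": ("pEXS10zp-adh1", "nos3' from pEXSadh1-nos3', EcoRI/ApaI (Step 6)"),
    "pEXS33": ("pEXS10zp-adh1-nos3'", "STSI transit peptide PCR product, EcoRI/BamHI (Step 7)"),
    "pEXS33(m)": ("pEXS33", "stop codon corrected by site-directed mutagenesis (Step 8)"),
    "pEXS50": ("pEXS33(m)", "NotI site removed (Step 9)"),
    "pEXS60": ("pEXS33(m)", "adh1 intron removed (Step 10)"),
    "pEXS51": ("pEXS50", "STSIII NdeI-EcoRI fragment (Step 11)"),
    "pEXS61": ("pEXS60", "STSIII NdeI-EcoRI fragment (Step 11)"),
    "pEXS52": ("pEXS51", "Table 8 gene at NdeI/NotI (Step 12)"),
}


def lineage(name):
    """Follow parent backbones from a plasmid back to the starting vector."""
    chain = [name]
    while chain[-1] in PLASMIDS:
        chain.append(PLASMIDS[chain[-1]][0])
    return chain


if __name__ == "__main__":
    for plasmid in lineage("pEXS52"):
        note = PLASMIDS.get(plasmid, (None, "starting vector"))[1]
        print(f"{plasmid}: {note}")
```

The trace shows that pEXS52 descends from the intron-containing branch (pEXS50/pEXS51), while pEXS60 and pEXS61 form the parallel branch without the adh1 intron.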

Plasmid pEXS52 was transformed into rice. The regenerated rice plants transformed
with pEXS52 were marked and placed in a magenta box.

Two siblings of each line were chosen from the magenta box and transferred into 2.5
inch pots filled with soil mix (topsoil mixed with peat-vermiculite 50/50). The pots were
placed in an aquarium (fish tank) with half an inch of water. The top was covered to
maintain high humidity (some holes were made to help heat escape). A thermometer
monitored the temperature. The fish tank was placed under fluorescent lights. No fertilizer
was used on the plants in the first week. Light period was 6 a.m.-8 p.m., minimum 14 hours
light. Temperature was minimum 68°F at night, 80°-90°F during the day. A heating mat
was used under the fish tank to help root growth when necessary. The plants stayed in the
above condition for a week. (Note: the seedlings began to grow tall because of low light
intensity.)

After the first week, the top of the aquarium was opened and rice transformants were
transferred to growth chambers for three weeks with high humidity and high light intensity.

Alternatively, water mist in the greenhouse can be used to maintain high humidity.
The plants grew for three weeks. Then the plants were transferred to 6-inch pots (minimum
5-inch pots) with soil mix (topsoil and peat-vermiculite, 50/50). The pots were in a tray filled with
half an inch of water. 15-16-17 (N-K-P) was used to fertilize the plants (250 ppm) once a
week or according to the plants' needs by their appearances. The plants remained in 14 hours
light (minimum), 6 a.m.-8 p.m., high light intensity, temperature 85°-90°F/70°F day/night.
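
For readers working in metric units, the growth temperatures above can be converted to Celsius. This is a small illustrative sketch; the dictionary labels and function name are hypothetical, and only the Fahrenheit values come from the text.

```python
def fahrenheit_to_celsius(deg_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


# Fahrenheit values quoted for the two growth phases.
CONDITIONS_F = {
    "first week, night minimum": [68],
    "first week, day": [80, 90],
    "after transfer, day": [85, 90],
    "after transfer, night": [70],
}

CONDITIONS_C = {
    label: [round(fahrenheit_to_celsius(t), 1) for t in temps]
    for label, temps in CONDITIONS_F.items()
}

if __name__ == "__main__":
    for label, temps in CONDITIONS_C.items():
        print(label, "->", " to ".join(f"{t} °C" for t in temps))
```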

The plants formed rice grains and the rice grains were harvested. These harvested
seeds can have the starch extracted and analyzed for the presence of the ligated amino acids
C, V, A, E, L, S, R, E [SEQ ID NO:27] in the starch within the seed.

Example Seven: 

SER Vector for Plants:

The plasmid shown in Figure 6 is adapted for use in monocots, i.e., maize. Plasmid 
pEXS52 (FIG. 6) has a promoter, a transit peptide (from maize), and a ligated gene fragment 
(TGC GTC GCG GAG CTG AGC AGG GAG) [SEQ ID NO:26] which encodes the amino 
acid sequence CVAELSRE [SEQ ID NO:27]. 

This gene fragment naturally occurs close to the N-terminal end of the maize soluble
starch synthase (MSTSI) gene. As is shown in TABLE 8, at about amino acid 292 the SER
from the starch synthase begins. This vector is preferably transformed into a maize host.
The transit peptide is adapted for maize so this is the preferred host. Clearly the transit
peptide and the promoter, if necessary, can be altered to be appropriate for the host plant
desired. After transformation by "whiskers" technology (U.S. Patent Nos. 5,302,523 and
5,464,765), the transformed host cells are regenerated by methods known in the art, the
transformant is pollinated, and the resultant kernels can be collected and analyzed for the
presence of the peptide in the starch and the starch granule.

The following preferred genes can be employed in maize to improve feeds: phytase
gene, the somatotropin gene, the following chained amino acids: AUG AUG AUG AUG
AUG AUG AUG AUG [SEQ ID NO:28]; and/or, AAG AAG AAG AAG AAG AAG AAG
AAG AAG AAG AAG AAG [SEQ ID NO:29]; and/or AAA AAA AAA AAA AAA AAA
[SEQ ID NO:30]; or a combination of the codons encoding the lysine amino acid in a chain
or a combination of the codons encoding both lysine and the methionine codon or any
combination of two or three of these amino acids. The length of the chains should not be
unduly long but the length of the chain does not appear to be critical. Thus the amino acids
will be encapsulated within the starch granule or bound within the starch formed in the
starch-bearing portion of the plant host.
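
As a quick illustration of what the three codon chains encode, the sketch below decodes them with a deliberately partial codon map (only the three codons that appear here; the names are hypothetical): AUG is methionine, and AAG and AAA are both lysine.

```python
# Partial RNA codon map covering only the codons used in SEQ ID NOS:28-30.
CODON_TO_RESIDUE = {"AUG": "Met", "AAG": "Lys", "AAA": "Lys"}

CHAINS = {
    "SEQ ID NO:28": "AUG AUG AUG AUG AUG AUG AUG AUG",
    "SEQ ID NO:29": "AAG AAG AAG AAG AAG AAG AAG AAG AAG AAG AAG AAG",
    "SEQ ID NO:30": "AAA AAA AAA AAA AAA AAA",
}


def decode(chain):
    """Return the list of residues encoded by a space-separated codon chain."""
    return [CODON_TO_RESIDUE[codon] for codon in chain.split()]


SUMMARY = {
    name: (decode(chain)[0], len(decode(chain)))
    for name, chain in CHAINS.items()
}

if __name__ == "__main__":
    for name, (residue, count) in SUMMARY.items():
        print(f"{name}: {count} x {residue}")
```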

This plasmid may be transformed into other cereals such as rice, wheat, barley, oats,
sorghum, or millet with little to no modification of the plasmid. The promoter may be the
waxy gene promoter whose sequence has been published, or other zein promoters known to
the art.

Additionally these plasmids, without undue experimentation, may be transformed into
dicots such as potatoes, sweet potato, taro, yam, lotus, cassava, peanuts, peas, soybean, beans,
or chickpeas. The promoter may be selected to target the starch-storage area of particular
dicots or tubers, for example the patatin promoter may be used for potato tubers.

Various methods of transforming monocots and dicots are known in the industry and
the method of transforming the genes is not critical to the present invention. The plasmid can
be introduced into Agrobacterium tumefaciens by the freeze-thaw method of An et al. (1988)
"Binary Vectors," in Plant Molecular Biology Manual A3, S.B. Gelvin and R.A. Schilperoort,
eds. (Dordrecht, The Netherlands: Kluwer Academic Publishers), pp. 1-19. Preparation of
Agrobacterium inoculum carrying the construct and inoculation of plant material, regeneration
of shoots, and rooting of shoots are described in Edwards et al., "Biochemical and molecular
characterization of a novel starch synthase from potatoes," Plant J. 8, 283-294 (1995).

Encapsulating regions are present in a number of different genes.
Although it is preferred that the protein be encapsulated within the starch granule (granule
encapsulation), encapsulation within non-granule starch is also encompassed within the scope
of the present invention in the term "encapsulation." The following types of genes are useful
for this purpose.

Use of Starch-Encapsulating Regions of Glycogen Synthase: 

E. coli glycogen synthase is not a large protein: the structural gene is 1431 base pairs
in length, specifying a protein of 477 amino acids with an estimated molecular weight of
49,000. It is known that problems of codon usage can occur with bacterial genes inserted into
plant genomes but this is generally not so great with E. coli genes as with those from other
bacteria such as those from Bacillus. Glycogen synthase from E. coli has a codon usage
profile much in common with maize genes but it is preferred to alter, by known procedures,
the sequence at the translation start point to be more compatible with a plant consensus
sequence:

glgA GATAATGCAG [SEQ ID NO:31]
cons AACAATGGCT [SEQ ID NO:32]
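
To make the proposed change at the translation start point concrete, the minimal Python sketch below lists the bases at which the native glgA context (SEQ ID NO:31) differs from the plant consensus (SEQ ID NO:32). The function name start_context_changes and the numbering convention (the A of ATG is +1, the base before it is -1) are assumptions made for illustration only. The sketch also checks, taking the stated figures at face value, that 1431 base pairs correspond to 477 codons.

```python
GLGA_START = "GATAATGCAG"  # SEQ ID NO:31, native E. coli glgA start context
PLANT_CONS = "AACAATGGCT"  # SEQ ID NO:32, plant consensus start context


def start_context_changes(native: str, consensus: str):
    """Return (position, native_base, consensus_base) for each differing base.

    Positions are numbered with the A of ATG as +1 and the base
    immediately upstream as -1 (there is no position 0).
    """
    if len(native) != len(consensus):
        raise ValueError("sequences must be the same length")
    start = native.find("ATG")
    if start == -1 or consensus.find("ATG") != start:
        raise ValueError("ATG must sit at the same index in both sequences")
    changes = []
    for i, (n, c) in enumerate(zip(native, consensus)):
        if n != c:
            pos = i - start + 1 if i >= start else i - start
            changes.append((pos, n, c))
    return changes


for pos, native_base, cons_base in start_context_changes(GLGA_START, PLANT_CONS):
    print(f"{pos:+d}: {native_base} -> {cons_base}")

GENE_LENGTH_BP = 1431
print(GENE_LENGTH_BP // 3)  # 477 codons at three bases per codon
```

Note that three of the five differences fall downstream of the ATG, so adopting the full consensus would also alter the codons immediately following the start codon; whether that is acceptable for a given construct is a design decision the sketch does not address.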

Use of Starch-Encapsulating Regions of Soluble Starch Synthase: 

cDNA clones of plant soluble starch synthases are described in the background section
above and can be used in the present invention. The genes for any such SSTS protein may be
used in constructs according to this invention.

Use of Starch-Encapsulating Regions of Branching Enzyme: 

cDNA clones of plant, bacterial and animal branching enzymes are described in the
background section above and can be used in the present invention. Branching enzyme
[1,4-α-D-glucan:1,4-α-D-glucan 6-α-D-(1,4-α-D-glucano)-transferase (E.C. 2.4.1.18)],
sometimes called Q-enzyme, converts amylose to amylopectin: a segment of a 1,4-α-D-glucan
chain is transferred to a primary hydroxyl group in a similar glucan chain.

The sequence of maize branching enzyme I was investigated by Baba et al. (1991)
BBRC, 181:87-94. Starch branching enzyme II from maize endosperm was investigated by
Fisher et al. (1993) Plant Physiol. 102:1045-1046. The BE gene construct may require the
presence of an amyloplast transit peptide to ensure its correct localization in the amyloplast.
The genes for any such branching enzyme of GBSTS protein may be used in constructs 
according to this invention. 

Use of Starch-Binding Domains of Granule-Bound Starch Synthase:

cDNA clones of plant granule-bound starch synthases are described in
Shure et al. (1983) Cell 35:225-233, and Visser et al. (1989) Plant Sci. 64(2):185-192.
Visser et al. have also described the inhibition of the expression of the gene for granule-bound
starch synthase in potato by antisense constructs ((1991) Mol. Gen. Genet. 225(2):289-296;
(1994) The Plant Cell 6:43-52). Shimada et al. show antisense inhibition in rice (1993) Theor. Appl.
Genet. 86:665-672. Van der Leij et al. show restoration of amylose synthesis in low-amylose
potato following transformation with the wild-type waxy potato gene (1991) Theor. Appl.
Genet. 82:289-295.



The amino acid sequences and nucleotide sequences of granule-bound starch synthases from,
for example, maize, rice, wheat, potato, cassava, peas or barley are well known. The genes
for any such GBSTS protein may be used in constructs according to this invention.



Construction of Plant Transformation Vectors: 

Plant transformation vectors for use in the method of the invention may be constructed
using standard techniques.

Use of Transit Peptide Sequences: 

Some gene constructs require the presence of an amyloplast transit peptide to ensure
correct localization in the amyloplast. It is believed that chloroplast transit peptides have
similar sequences (von Heijne et al. describe a database of chloroplast transit peptides in (1991)
Plant Mol. Biol. Reporter, 9(2):104-126). Other transit peptides useful in this invention are
those of ADPG pyrophosphorylase ((1991) Plant Mol. Biol. Reporter, 9:104-126), small
subunit RUBISCO, acetolactate synthase, glyceraldehyde-3-P-dehydrogenase and nitrite
reductase.

The consensus sequence of the transit peptide of small subunit RUBISCO from many
genotypes has the sequence:



MASSMLSSAAVATRTNPAQASM VAPFTGLKSAAFPVSRKQNLDI TSIASNGGRVQC 
[SEQ ID NO:33] 

The transit peptide of corn small subunit RUBISCO has the sequence:

MAPTVMMASSATATRTNPAQAS AVATFQGLKSTASLPVARRSSR SLGNVASNGGRIRC 
[SEQ ID NO:34] 

The transit peptide of leaf glyceraldehyde-3-P-dehydrogenase from corn has the
sequence:

MAQILAPSTQWQMRITKTSPC A TPITSKMWSSLVMKQTKKV AHS
AKFRVMAVNSENGT [SEQ ID NO:35]

The transit peptide sequence of corn endosperm-bound starch synthase has the 
sequence: 

MAALATSQLVATRAGHGVPDASTFRRGAAQGLRGARASAAADTLSMRTSARAAPRHQ
QQARRGGRFPFPSLVVC [SEQ ID NO:36]

The transit peptide sequence of corn endosperm soluble starch synthase has the 
sequence: 

MATPSAVGAACLLLARXAWPAAVGDRARPRRLQRVLRRR [SEQ ID NO:37] 
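
For readers who wish to compare the transit peptides listed above, the following minimal Python sketch tallies length and a few residue classes for SEQ ID NOS:36 and 37, copied as printed. Transit peptides are often described as relatively rich in hydroxylated and basic residues and poor in acidic ones; the sketch merely counts residues and is not evidence for or against that generalization. The names TRANSIT_PEPTIDES and summarize are illustrative assumptions, and the "X" in SEQ ID NO:37 (an unidentified residue) contributes to length only.

```python
# Sequences copied from the listing above (line breaks removed).
TRANSIT_PEPTIDES = {
    "SEQ ID NO:36": (
        "MAALATSQLVATRAGHGVPDASTFRRGAAQGLRGARASAAADTLSMRTSARAAPRHQ"
        "QQARRGGRFPFPSLVVC"
    ),
    "SEQ ID NO:37": "MATPSAVGAACLLLARXAWPAAVGDRARPRRLQRVLRRR",
}


def summarize(seq: str) -> dict:
    """Count length and simple residue classes in a one-letter peptide sequence."""
    return {
        "length": len(seq),
        "hydroxylated": sum(seq.count(r) for r in "ST"),  # Ser, Thr
        "basic": sum(seq.count(r) for r in "KR"),         # Lys, Arg
        "acidic": sum(seq.count(r) for r in "DE"),        # Asp, Glu
    }


for name, seq in TRANSIT_PEPTIDES.items():
    print(name, summarize(seq))
```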

Engineering New Amino Acids or Peptides into Starch-Encapsulating Proteins: 

The starch-binding proteins used in this invention may be modified by methods known
to those skilled in the art to incorporate new amino acid combinations. For example,
sequences of starch-binding proteins may be modified to express higher-than-normal levels of
lysine, methionine or tryptophan. Such levels can be usefully elevated above natural levels
and such proteins provide nutritional enhancement in crops such as cereals.

In addition to altering amino acid balance, it is possible to engineer the starch-binding
proteins so that valuable peptides can be incorporated into the starch-binding protein.
Attaching the payload polypeptide to the starch-binding protein at the N-terminal end of the
protein provides a known means of adding peptide fragments and still maintaining
starch-binding capacity. Further improvements can be made by incorporating specific protease
cleavage sites into the site of attachment of the payload polypeptide to the starch-encapsulating
region. It is well known to those skilled in the art that proteases have preferred specificities
for different amino-acid linkages. Such specificities can be used to provide a vehicle for
delivery of valuable peptides to different regions of the digestive tract of animals and man.
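
The arrangement described above (payload at the N-terminal end, a protease cleavage site at the junction, then the starch-encapsulating region) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the method of the invention: the payload and SER sequences are hypothetical placeholders, and the motif IEGR (the Factor Xa recognition sequence, cut after the arginine) is used only as a familiar example. A protease matched to the intended region of the digestive tract would be substituted in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionProtein:
    payload: str        # payload polypeptide (N-terminal), hypothetical
    cleavage_site: str  # protease recognition motif at the junction
    ser: str            # starch-encapsulating region (C-terminal), hypothetical

    @property
    def sequence(self) -> str:
        return self.payload + self.cleavage_site + self.ser


def cleave(sequence: str, site: str) -> list[str]:
    """Cut once, immediately after the first occurrence of `site`.

    Returns the uncut sequence in a one-element list if the site is absent.
    """
    index = sequence.find(site)
    if index == -1:
        return [sequence]
    cut = index + len(site)
    return [sequence[:cut], sequence[cut:]]


# Placeholder sequences chosen for illustration only.
fusion = FusionProtein(payload="MKKMKK", cleavage_site="IEGR", ser="GSTVLAAGHW")
released, anchor = cleave(fusion.sequence, fusion.cleavage_site)
print("released fragment:", released)
print("starch-bound remainder:", anchor)
```

With the payload on the N-terminal side and a protease that cuts after its motif, the released fragment retains the motif residues at its C-terminus; a real design would need to account for that.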

In yet another embodiment of this invention, the payload polypeptide can be released
following purification and processing of the starch granules. Using amylolysis and/or
gelatinization procedures it is known that the proteins bound to the starch granule can be
released or become available for proteolysis. Thus recovery of commercial quantities of
proteins and peptides from the starch granule matrix becomes possible.

In yet another embodiment of the invention it is possible to process the starch granules
in a variety of different ways in order to provide a means of altering the digestibility of the
starch. Using this methodology it is possible to change the bioavailability of the proteins,
peptides or amino acids entrapped within the starch granules.

Although the foregoing invention has been described in detail by way of illustration
and example for purposes of clarity and understanding, it will be readily apparent to those of
ordinary skill in the art in light of the teachings of this invention that certain changes and
modifications may be made thereto without departing from the spirit or scope of the appended
claims.